CV
Basics
| Name | Charvi Jain |
| Label | AI Researcher · Knowledge-Grounded LLMs · Generative AI · Agentic Scientific Discovery |
| Email | charvi.jain@tu-dresden.de |
| Summary | PhD researcher at TU Dresden specializing in knowledge-grounded large language models and agentic systems for autonomous scientific discovery. |
Education
-
2024.02 - 2027.01 Germany
Ph.D. in Knowledge Integration in Large Language Models for Systems Biology
School of Embedded Composite Artificial Intelligence, TU Dresden
- LLMs
- Agentic AI
- RAG
-
2019.10 - 2022.01 Germany
M.Sc. in Computational Modeling and Simulation
TU Dresden
- Deep Learning
- Computer Vision
- Knowledge Graphs
- Data Analytics
-
2015.07 - 2019.05 India
B.Tech in Computer Science and Engineering
Indian Institute of Information Technology, Una, HP
- Machine Learning
- Data Structures & Algorithms
- Database Management Systems
Work
-
2024.10 - Present Germany
Graduate Lecturer · Conversational AI
TU Dresden — Chair of Conversational AI
- Designed and delivered graduate-level curriculum spanning transformer architectures, dialogue systems, RAG, LLM evaluation, and agentic AI
- Supervised and assessed cohorts of 10 students (2024–25) and 14 students (2025–26)
-
2024.02 - Present Germany
PhD Researcher · Knowledge-Grounded LLMs & AI Scientist Evaluation
TU Dresden — Faculty of Computer Science
Supervised by Prof. Jens Lehmann (Principal Scientist, Amazon) and Prof. Ivo Sbalzarini.
- Investigating how conversational AI systems can dynamically retrieve and integrate external scientific knowledge with parametric knowledge to produce long-form research reports.
- Designing benchmarks and evaluation frameworks for agentic LLM systems generating open-ended scientific outputs — addressing the breakdown of traditional fixed-metric evaluation under unbounded generative AI.
-
2022.01 - 2024.01 Germany
LLM Research Engineer · Pre-training & Knowledge Integration
Fraunhofer IAIS, Dresden
- Pre-trained Teuken-7B-Base and Teuken-7B-Instruct, 7B-parameter decoder LLMs, from scratch on A100 GPU clusters, using a corpus with ~60% non-English data and a custom multilingual tokenizer supporting all 24 official EU languages (OpenGPTX, a pan-European AI initiative)
- Co-conducted the first systematic ablation of tokenizer design choices across 24 mono- and multilingual LLM variants at 2.6B parameter scale over diverse multilingual corpora; accepted at NAACL 2024
- Engineered a retrieval-augmented generation (RAG) pipeline integrating structured knowledge graphs with unstructured scientific text, improving factual grounding and reducing hallucinations in LLM outputs
- Researched alignment between structured knowledge graph representations and LLM latent spaces to enable controllable, knowledge-grounded text generation
Projects
- 2021.04 - 2021.12
Synthetic Relational Medical Databases via GANs
Master Thesis — Chair of Medical Informatics and Biometry, TU Dresden. Supervised by Prof. Dr. Martin Sedlmayr.
- Designed and trained GANs to generate synthetic multi-table relational medical databases, tackling the hard problem of preserving referential integrity across linked clinical tables
- Evaluated synthetic data fidelity using statistical divergence metrics and ML efficacy scores (train-on-synthetic, test-on-real)
- Addressed privacy-safe medical data sharing — an acute challenge for ML research in healthcare
- 2020.10 - 2021.03
Privacy-Preserving Aneurysm Rupture Prediction
ScaDS.AI, Leipzig University / TU Dresden. Supervised by Ms. Maja Schneider & Prof. Dr. Erhard Rahm.
- Applied Differentially Private GANs (DP-GANs) to generate synthetic single-table clinical datasets with formal ε-differential privacy guarantees
- Benchmarked downstream model utility across a range of privacy budgets to characterise the fundamental privacy-utility trade-off in clinical ML
- 2020.09 - 2021.01
Visual Feature Detection for Surgical SLAM
National Center for Tumor Diseases (NCT), Dresden. Supervised by Mr. Reuben Docea.
- Benchmarked classical (Harris Corner, Lucas-Kanade) and deep learning (SuperPoint, KeyPointNet) feature detectors for surgical scene understanding in endoscopic video
- Evaluated SLAM pipeline robustness under clinically relevant degradation: occlusion, tissue deformation, and low-texture surfaces
- 2018.08 - 2019.03
Image Captioning with CNN-LSTM
Computer Science Department, NIT Hamirpur. Supervised by Dr. Neha Sharma. Published at ICAEECI 2019.
- Built an end-to-end visual language generation pipeline: ResNet CNN encoder for image features, LSTM decoder for caption generation
- Investigated attention mechanisms and beam search decoding for improved caption diversity and fluency
Internships
-
2021.05 - 2021.12 Germany
Working Student — AI/ML Engineering
Fraunhofer IAIS, Dresden
Supervised by Dr. Diego Collarana.
- Built content-based filtering recommender systems leveraging knowledge graph entity embeddings for semantic item representation
- Integrated structured knowledge sources into recommendation pipelines to improve cold-start coverage and recommendation diversity
-
2020.09 - 2020.10 Germany
Software Engineer Intern — Mobile
manaTec, Dresden
Supervised by Mr. Robert Dukstein.
- Delivered a cross-platform mobile application in Flutter/Dart targeting Android and iOS from a single codebase
- Implemented REST API integration, reactive state management, and responsive UI
-
2020.01 - 2020.08 Germany
Working Student — AR & Mobile Development
Institute of Railway Vehicles and Railway Technology, TU Dresden
Supervised by Dr.-Ing. Martin Kache & Karim Benabdellah.
- Developed an Augmented Reality mobile application using Unity, C#, and Vuforia for real-time 3D technical visualization on Android and iOS
- Enabled field engineers to overlay CAD models onto physical railway components for inspection and maintenance workflows
-
2018.05 - 2018.06 India
Research Intern — Computer Vision
Raman Lab, MNIT Jaipur
Supervised by Prof. Rajesh Kumar.
- Trained and benchmarked ML classifiers (SVM, CNN, Random Forest) for 6-class facial emotion recognition from still images
- Published findings at IEEE ICRAIE 2018
Hackathons
-
2023.05 FZJ Jülich, Germany
Helmholtz GPU Hackathon
Selected participant in a competitive GPU computing hackathon hosted by Forschungszentrum Jülich.
- Investigated the efficiency of tensor parallelism in PyTorch 2.0 relative to Fully Sharded Data Parallel (FSDP) on multi-GPU HPC clusters
- Profiled and compared distributed training strategies for large-scale model training workloads
-
2019.12 TU Dresden, Germany
Game Jam — Best Innovative Team Award
48-hour game jam; awarded Best Innovative Team.
- Built SuddenlyAR — an Augmented Reality quest game using Unity, Blender, and Vuforia within 48 hours
- Designed and implemented AR marker-based interaction, 3D asset pipeline, and game logic from scratch
-
2017.04 NIT Hamirpur, India
Hackathon 2.0 — 1st Place
Won 1st place among the competing teams.
- Built Easy Outpass — an RFID-based automated campus gate-pass system with an Android front-end and MySQL backend
- Delivered end-to-end hardware-software integration within the hackathon timeframe
Skills
| AI / ML Frameworks | |
| PyTorch | |
| HuggingFace Transformers | |
| HuggingFace Datasets & Accelerate | |
| Scikit-learn | |
| LangChain | |
| LLM & Generative AI | |
| LLM Pre-training & Fine-tuning | |
| Retrieval-Augmented Generation (RAG) | |
| Knowledge Graph Integration | |
| Evaluation & Benchmarking | |
| Agentic Workflows | |
| Languages & Tools | |
| Python | |
| C++ | |
| Java | |
| Bash / Unix | |
| SQL | |
| Git | |
| Docker | |
| HPC & Distributed Training | |
| SLURM | |
| FSDP / Tensor Parallelism | |
| A100 GPU Clusters | |
| Taurus (TU Dresden) | |
| JUWELS Booster (FZJ Jülich) | |
Awards
- 2021
Virtual Grace Hopper Conference EMEA Scholar
AnitaB.org
Competitive scholarship for women in computing, selected for the EMEA region cohort
- 2019.12
Best Innovative Team — Game Jam
TU Dresden
Built SuddenlyAR, an Augmented Reality quest game, using Unity, Blender, and Vuforia
- 2017.04
1st Place — Hackathon 2.0
NIT Hamirpur
Built Easy Outpass — RFID-based campus access Android app with MySQL backend
- 2015
Indra Priyadarshini Puruskar
National merit award recognising academic excellence; prize of INR 1,00,000 (~USD 1,538)
- 2015
Gargi Award — Balika Prothsahan Puruskar
Government of Rajasthan
State government merit scholarship for outstanding secondary examination results
- 2013
SSTSE Scholar — State Science Talent Search Examination
Board of Secondary Education, Govt. of Rajasthan
Selected among top state-level science students in Rajasthan
Certificates
| MLx Generative AI — Oxford Machine Learning Summer School (OxML) | ||
| AI for Global Goals & University of Oxford | 2025-06 |
| Oracle Database 11g — SQL Fundamentals I | ||
| Oracle | 2017 |
Languages
| English | |
| C2 — TOEFL 100 · GRE 315 (V: 150, Q: 165) |
| German | |
| B1 — Goethe B1 Certificate |
| Hindi | |
| Native |