Ketaki Ghatole

Building at the intersection of AI & Biology

I'm a computational biologist trained at Carnegie Mellon, with experience in both the biotech industry and academic research labs. My work spans bioinformatics, machine learning, and product development, where I've built computational tools that support scientific teams in real workflows. I also write about advances in AI for biotech, focusing on practical applications that scientific teams can evaluate, adapt, and apply to their own work.

Right now I'm exploring how biotech teams can preserve context, reason across fragmented information, and make better decisions from the knowledge they already have. If you're building in this space or just want to chat, feel free to reach out!

Education

M.S. Computational Biology

Carnegie Mellon University
Swartz Fellow · 2021

B.E. Biotechnology

M.S. Ramaiah Institute of Technology
Co-founder · Numera Math & Logic Club · Best Project Award · 2020
Experience
Nov 2025 – June 2026
Bioinformatics Product Manager
Creyon Bio

Launched an AI-driven target discovery tool that cut scientific evaluation time by ~3 hours per target. Drove company-wide responsible GenAI adoption by identifying high-value use cases and establishing best practices.

Aug 2023 – Oct 2025
Bioinformatics Data Analyst
Creyon Bio

Built and maintained scalable bioinformatics pipelines for large-scale genomic data analysis. Delivered end-to-end automated assay workflows, reducing analysis time by 50% per dataset.

June 2023 – Aug 2023
Computational Biologist
Jefferson Health

Performed differential gene expression analysis and gene set enrichment analysis (GSEA) to uncover biological pathways for genes of interest, and built visualization dashboards to communicate computational findings to wet-lab collaborators.

June 2022 – Aug 2022
Computational Science Intern
Moderna

Implemented an eCLIP data ingestion and analysis system with automated preprocessing, peak calling, and motif extraction, plus a user-facing front end for exploring RNA-binding proteins.

Jan 2022 – May 2022
Graduate Research Assistant (Schwartz Lab) · Teaching Assistant
Carnegie Mellon University

Research in the Schwartz Lab on computational biology problems; teaching assistant for graduate-level computational biology coursework.

Oct 2020 – Mar 2021
Junior Analyst
Axiom Healthcare Strategies

Analyzed clinical trial results, press releases, and conference proceedings to forecast competitive trends and strategic opportunities in the oncology sector.

Product · Full-stack

Clinical Trial Finder

Clinical trial search is difficult to navigate when patients need options relevant to a specific disease context. I built a full-stack lung cancer trial-matching app, with backend services, a frontend interface, and Supabase, that makes trials easier to search, filter, and understand in light of patient-specific needs.
View code →
AI · Knowledge Graph

Bioweave

Biological relationships across genes, proteins, and publications are often scattered across separate databases. Bioweave is an interactive knowledge graph connecting these fragmented resources to make biological context easier to search and visualize.
View code →
Data Infrastructure

Bio Data Organizer

Biological datasets often start as scattered files and semi-structured outputs, making analysis hard to reproduce. This prototype workflow uses schema-driven organization and AI to automatically structure experimental data into cleaner, analysis-ready formats.
View code →
Machine Learning

Multi-omic Lung Cancer Analysis

A multi-omic workflow for lung cancer subtype classification using TCGA gene expression, CNV, and clinical data. I built a PyTorch autoencoder for unsupervised feature selection, and the resulting multi-omic ensemble outperformed mono-omic models.
View code →
Machine Learning

Malaria Detector

Manual blood-smear screening can be slow and inconsistent, especially in low-resource settings. I built a ResNet50 transfer-learning classifier in PyTorch to distinguish infected from healthy cells.
View code →
Bioinformatics

Variant Calling Pipeline

A reproducible workflow for calling SNPs and indels from raw paired-end reads. The pipeline outputs filtered variant tables for downstream analysis and IGV visualization.
View code →
Bioinformatics

RBP Peak Finder

A computational workflow for identifying where host RNA-binding proteins may interact with viral RNA. Built with eCLIP data, the pipeline first identifies mapped-read peaks, then uses a CNN to distinguish likely binding peaks from background sequences and highlight potential binding regions.
View code →
Writing
Let's build something

Working on AI × biology?
Let's talk.