Testing ChatGPT for cell type recognition

Biologists (including myself) often need to identify types of cells based on their gene expression. For example, if I’m differentiating stem cells to make an ovarian organoid, and I perform single-cell RNA sequencing, I might want to check the data to see which ovarian cell types are present.

Today, a Nature Methods paper reported good results from giving GPT-4 a list of cell-specific genes and asking it to identify the cell type. This seems interesting, and also quite easy to check for myself.

My test:

I don’t pay for access to GPT-4, but I gave ChatGPT a test using the prompt from the Nature Methods paper, with the following cell markers:

Identify types of human cells using the following marker genes. Identify one cell type for each row. Only provide the cell type name.

SOX17, POU5F1, NANOS3, PRDM1, NANOG, CD38

POU5F1, SOX2, KLF4, ITGA6, NANOG

SOX17, FOXA2, CXCR4, GATA4

FOXL2, AMHR2, CD82, NR5A1, FSHR, GATA4

ZP3, DPPA3, DDX4, NPM2, ZP2

FOXL2, FSHB, NR5A1, PITX1, GNRHR

STK31, ZBTB16, DDX4, SSEA4, NANOS2

NR2F2, CYP17A1, STAR, LHCGR, GLI1, HSD3B

CASC3, PGAP1, SLC6A16, CNTNAP4, NPHP1

SYCP1, TEX12, REC8, SPO11, SYCP3

NR5A1, SOX9, FSHR, GATA4

OTX2, SOX1, TUBB3, PAX6

ChatGPT’s response:

  1. Endoderm cell

  2. Induced pluripotent stem cell (iPSC)

  3. Definitive endoderm cell

  4. Granulosa cell

  5. Oocyte

  6. Granulosa cell

  7. Spermatogonial stem cell

  8. Theca cell

  9. Sertoli cell

  10. Spermatocyte

  11. Leydig cell

  12. Neuroepithelial cell

Results:

  1. Incorrect, this is a primordial germ cell. This should have been an easy one! SOX17 does overlap with endoderm though.

  2. Correct

  3. Correct

  4. Correct

  5. Correct

  6. Incorrect, this is a pituitary gonadotroph. Although it does share FOXL2 and NR5A1 expression with granulosa cells, FSHB and GNRHR are very specific to the pituitary.

  7. Correct

  8. Correct

  9. Very incorrect. This is just a list of random genes! ChatGPT seemed to be biased by me giving it other reproductive cell types.

  10. Sort of correct: these are just markers for meiosis, not for a specific cell type.

  11. Incorrect, this is a Sertoli cell. Leydig cells lack SOX9 expression.

  12. Basically correct, although I would have called it “neural ectoderm”.

Overall score: 7.5 / 12
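The tally can be reproduced in a few lines of Python. This is only a sketch: the credit weights are an assumption (full credit for “basically correct”, half credit for “sort of correct”), chosen because they are consistent with the 7.5 total, and the names `CREDIT` and `total_score` are made up for illustration.

```python
# Credit per verdict. The weights for partial answers are an assumption,
# picked because they reproduce the 7.5 / 12 total from the results above.
CREDIT = {
    "correct": 1.0,
    "basically correct": 1.0,
    "sort of correct": 0.5,
    "incorrect": 0.0,
}

# Verdicts for rows 1 to 12 of the first test, in order.
first_test = [
    "incorrect",          # 1: primordial germ cell, called endoderm
    "correct",            # 2
    "correct",            # 3
    "correct",            # 4
    "correct",            # 5
    "incorrect",          # 6: pituitary gonadotroph, called granulosa cell
    "correct",            # 7
    "correct",            # 8
    "incorrect",          # 9: random genes, called Sertoli cell
    "sort of correct",    # 10: meiosis markers
    "incorrect",          # 11: Sertoli cell, called Leydig cell
    "basically correct",  # 12
]


def total_score(verdicts):
    """Sum the credit for a list of verdict strings."""
    return sum(CREDIT[v] for v in verdicts)


print(f"{total_score(first_test)} / {len(first_test)}")  # prints "7.5 / 12"
```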

Let’s test again:

In the first test, ChatGPT got the random genes completely wrong. Let’s tell it to answer “unknown” when it doesn’t actually know the cell type.

Identify types of human cells using the following marker genes. Identify one cell type for each row. Only provide the cell type name. If you are uncertain, respond “unknown” instead of providing a cell type name.

SOX17, POU5F1, NANOS3, PRDM1, NANOG, CD38

POU5F1, SOX2, KLF4, ITGA6, NANOG

SOX17, FOXA2, CXCR4, GATA4

FOXL2, AMHR2, CD82, NR5A1, FSHR, GATA4

ZP3, DPPA3, DDX4, NPM2, ZP2

FOXL2, FSHB, NR5A1, PITX1, GNRHR

STK31, ZBTB16, DDX4, SSEA4, NANOS2

NR2F2, CYP17A1, STAR, LHCGR, GLI1, HSD3B

CASC3, PGAP1, SLC6A16, CNTNAP4, NPHP1

SYCP1, TEX12, REC8, SPO11, SYCP3

NR5A1, SOX9, FSHR, GATA4

OTX2, SOX1, TUBB3, PAX6
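The two prompts differ only in the final “unknown” sentence, so they are easy to generate from lists of marker genes. Here is a minimal sketch for anyone who wants to try their own markers. The function name `build_prompt` and the `allow_unknown` flag are hypothetical, and the code only builds the text; it does not call any API.

```python
def build_prompt(marker_rows, allow_unknown=False):
    """Assemble the cell type identification prompt from lists of marker genes."""
    instruction = (
        "Identify types of human cells using the following marker genes. "
        "Identify one cell type for each row. Only provide the cell type name."
    )
    if allow_unknown:
        # Extra sentence used in the second test.
        instruction += (
            ' If you are uncertain, respond "unknown" instead of providing '
            "a cell type name."
        )
    # One comma-separated row per gene list, with a blank line between rows.
    rows = [", ".join(genes) for genes in marker_rows]
    return "\n\n".join([instruction] + rows)


marker_rows = [
    ["SOX17", "POU5F1", "NANOS3", "PRDM1", "NANOG", "CD38"],
    ["CASC3", "PGAP1", "SLC6A16", "CNTNAP4", "NPHP1"],  # random genes
]
print(build_prompt(marker_rows, allow_unknown=True))
```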

ChatGPT’s response (and my scoring):

  1. Endoderm (still incorrect!)

  2. Embryonic stem cell (correct, although it said iPSC before)

  3. Definitive endoderm (correct)

  4. Granulosa cell (correct)

  5. Oocyte (correct)

  6. Gonadotroph (correct)

  7. Spermatogonium (largely correct, but “spermatogonial stem cell” is better)

  8. Theca cell (correct)

  9. Sertoli cell (INCORRECT, it’s still thrown off by random genes!)

  10. Spermatocyte (sort of correct, same as before)

  11. Sertoli cell (correct this time!)

  12. Neural progenitor cell (again, basically correct but not what I would have called it)

This time it performs better (I’d give it 9.5/12), but it’s still tricked by random genes, and it still can’t recognize primordial germ cells.

Let’s try with only random genes:

Identify types of human cells using the following marker genes. Identify one cell type for each row. Only provide the cell type name. If you are uncertain, respond “unknown” instead of providing a cell type name.

CASC3, PGAP1, SLC6A16, CNTNAP4, NPHP1

IL9, SLC30A4, SX18P8, CHRDL2, SMUG1P1

HCST, EXOSC8, ORC3, CIDECP2, DNM2

DTL, U3, DDX28, WDFY3, PPP1R2P4

LTK, STK32C, SMIM9, DPPA3P10, MTCO1P12

This time, ChatGPT just responded “unknown” for everything. Very good! Without the prompt to respond “unknown”, ChatGPT instead made wild guesses:

  1. Renal tubular epithelial cell

  2. Retinal pigment epithelial cell

  3. Natural killer cell

  4. Keratinocyte

  5. Spermatocyte
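For anyone who wants to build their own negative controls, random rows like these can be drawn from any list of gene symbols. A minimal sketch in Python: `random_marker_rows` is a hypothetical helper, and the hard-coded pool (just the 25 symbols from the rows above) is an assumption standing in for a full gene annotation.

```python
import random

# Stand-in pool: the 25 symbols from the random rows above. In practice this
# would be a much larger list loaded from a gene annotation (an assumption).
GENE_POOL = [
    "CASC3", "PGAP1", "SLC6A16", "CNTNAP4", "NPHP1",
    "IL9", "SLC30A4", "SX18P8", "CHRDL2", "SMUG1P1",
    "HCST", "EXOSC8", "ORC3", "CIDECP2", "DNM2",
    "DTL", "U3", "DDX28", "WDFY3", "PPP1R2P4",
    "LTK", "STK32C", "SMIM9", "DPPA3P10", "MTCO1P12",
]


def random_marker_rows(gene_pool, n_rows, genes_per_row=5, seed=0):
    """Draw n_rows lists of genes, without repeats inside a row."""
    rng = random.Random(seed)  # fixed seed so the rows are reproducible
    return [rng.sample(gene_pool, genes_per_row) for _ in range(n_rows)]


rows = random_marker_rows(GENE_POOL, n_rows=5)
for row in rows:
    print(", ".join(row))
```

Since any sensible answer for such rows is “unknown”, every named cell type in the response counts as a false positive.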

Conclusions

ChatGPT is remarkably good at identifying most cell types, but can be overconfident and assign a cell type to a list of random genes. This overconfidence also seems to depend on context: ChatGPT said the random gene list was Sertoli cells in the context of the larger list of reproductive cell types, but when given five lists of completely random genes, it said “unknown” for all of them. Giving the option to respond “unknown” was very important, since otherwise the main outcome was “bovine fecal cells”.

I still don’t trust ChatGPT enough to use it in my research, but it will be interesting to see if this improves over time. Also, if any readers can try my prompts with GPT-4, please post the results in the comments!