Textbooks are All You Need: Inside Microsoft Research’s Amazing Phi-1 Code Language Model
Jesus Rodriguez
Peter Burns
Jul 3, 2023
This deals with code. I was wondering whether the same approach could be applied to other types of data. LLMs are trained on a lot of junk, which leads to many problems such as bias, misinformation, and disinformation. If you cleaned up the training corpus (getting rid of disinformation websites and the like), would the effect be the same as in this study, which was done on code?
Written by Peter Burns
A curious polymath who wants to know how everything works. Blog: Renaissance Man Journal (http://gainweightjournal.com/).