#acl All:read

The ideal computational biology graduate student knows biology, computer science (including being able to program very well) and math/statistics very well, and has a working knowledge of engineering, chemistry and physics. However, it is rare that someone has this background before graduate school. The following are case studies of former UofT graduate students in computational biology (bioinformatics), the courses they took, and their experiences and recommendations. Hopefully, this will be useful for new computational biology graduate students at UofT when they are choosing their courses.

== Student 1 ==
 * Background: Biology and chemistry
 * Ph.D. Biochemistry Sep.1999-Mar.2005 (5.5 year completion time)
 * Took the following extra computer science courses:
  * CSC108 Intro Comp. Prog. Fall year 1
  * CSC148 Intro Com Sci Winter year 1
  * CSC270 Fund Data Str.& Tech Summer year 1
  * CSC228 File Struct Data Man. Winter year 2
  * CSC238 Disc. Math Comp. Sci. Winter year 3
  * CSC364 Computability& Complexity Winter year 4
  * CSC2506 Probabilistic Reason. Winter year 5
 * Student recommends taking no more than one extra course per semester (extra courses are in addition to the graduate courses). Finishing the course requirements for the degree should take priority over extra courses.
 * Student written description of experience:
{{{
My undergraduate degree was a combined major in biology and chemistry that gave me a solid footing in biochemistry, molecular biology, genetics and organic chemistry. My computer science training followed two parallel tracks: self-taught and formal courses.

Self-learning began with basic programming skills in C and the UNIX working environment. I later learned C++, Perl and MatLab. In retrospect I would not recommend learning to program in C. Java is by far a better choice for first-time programmers, but I would strongly encourage learning C/C++ at some point.
As one instructor once told us, “C may not be your first choice for a project but it will always be the second best”.

There is no substitute for experience. If your lab has experienced programmers or theoreticians, take advantage of their accumulated experience and ask questions, lots of questions. You will often find that asking a simple question can save you days, if not weeks, of frustrating debugging or help you circumvent problems that have been solved many times over. There is no better way to learn than by example. Look at other people’s code.

As I started my graduate work it quickly became apparent that although I was programming, there were large gaps in my understanding of theoretical concepts, which impeded my work. So I decided to go back to ‘school’ and fill those gaps with undergraduate courses. I began with introduction to computer science (CSC148), followed by fundamental data structures and algorithms (CSC270), file structures and data management (CSC228) and discrete math (CSC238). Towards the end of my graduate degree I also took courses in computational complexity and computability (CSC364) and a graduate level course in probabilistic reasoning (CSC2506).

There are a number of things to keep in mind when starting this process.
 * Start early. The sooner you learn, the more effective and rewarding your own research work will be.
 * Inform your supervisor that you are taking additional courses and to expect a substantial amount of your time to be dedicated to these courses.
 * Do not audit the courses; register and do the course work. There are three reasons for that. First, many of the concepts are not easy to grasp and only through assignments and exams do they solidify in your mind. Second, people who audit courses rarely attend all classes or do any of the work, so they do not learn all the topics covered in the course.
Finally, the courses will appear in your transcript, so you may be able (although with considerable struggle in my case) to convince your graduate coordinator to credit some of these courses in lieu of other degree requirements.
 * You don’t have to choose courses that are related to your graduate research. Pick courses that spark your interest. You’ll enjoy them more. There is no wasted knowledge; anything you learn may be of practical use at some point.

Today most bioinformatics programs include computer science courses, so many of the students are better prepared to tackle their graduate work. However, if you find a ‘hole’ in your knowledge, take the initiative and fill it.
}}}

== Student 2 ==
 * Background: Biochemistry and computer science
 * Ph.D. Biochemistry Sep.1998 to Oct.2002 (4 year completion time)
 * Took the following courses:
  * BCH1421 and BCH1422 during Fall year 1 (conditional acceptance until those courses were taken, to satisfy entry requirements - not enough biochemistry courses were taken in undergraduate). Somewhat duplicated knowledge learned in undergraduate studies, but was a good learning experience, especially the essay component, since scientific writing was not taught well in undergraduate studies.
  * BCH2021S in Winter year 1 - selected topic of course was bioinformatics. Course satisfied Ph.D. requirements and was taken out of interest. Was an excellent learning experience.
  * Audited MBP1011S in Winter year 1 to evaluate taking the course the next year. It is not recommended to audit a course if you will not take it for credit in the future, since you need to be extraordinarily disciplined to attend all lectures.
  * MBP1011S in Winter year 2. Was an excellent learning experience.
  * BCH2021S in Winter of year 3 - selected topic of course was protein-protein interactions. Course satisfied Ph.D. requirements and was taken out of interest. Was an excellent learning experience.

== Student 3 ==
 * Background: Human Biology and computer science
 * M.Sc.
   Biochemistry Sep.2002 to Jan.2005 (2.5 year completion time)
 * Took the following courses during M.Sc.:
  * BCH1421S (Protein Structure and Function) and BCH1422F (Cell surface biochemistry) during year 1 (conditional acceptance until those courses were taken, to satisfy entry requirements - with a human biology major, I hadn't taken enough biochemistry courses during undergraduate).
  * BCH2021F in Fall year 1 - selected topic of course was Transmembrane and Intracellular Signalling. Course satisfied M.Sc. requirements. Each class had a different lecturer who was an expert in their field. Challenging but informative course.
 * I felt the mixture of a computer science and human biology undergraduate gave me a solid base for research in bioinformatics. The one thing that I regret not focusing on during my undergraduate degree was statistics. No matter whether you focus on biology, computers or a mixture of the two, I found that the statistics requirements of my undergraduate didn't properly prepare me for my M.Sc.
 * Depending on what becomes your focus during a graduate degree, there will always be new topics that you know very little about. I found that one of the main goals of a graduate degree is to expand your current knowledge. Besides taking courses, you have to read papers and books and develop good research skills in order to get the most out of the experience.

== Student 4 ==
 * Background: Biomedical/Electrical Engineering, Engineering Science
 * Ph.D. Biochemistry May 1998-June 2003 (5 year completion time)
 * Took the following courses during Ph.D.:
  * BCH2021S in Winter year 1 - selected topic of course was bioinformatics.
  * MBP1011S in Winter year 2.
  * X-ray crystallography
  * Numerical methods (audited)
  * Statistics for life sciences
  * NMR methods - too specialized, so left the course

== Student 5 ==
 * Background: Biochemistry
 * Ph.D.
   Biochemistry Sep.1999 to Aug.2004 (5 year completion time)
 * Participated in the Biomolecular Structure Program (although didn't formally enroll in it).
 * Took the following courses within the first 2 years:
  * JBB1425 - Nice introductory course in experimental methods for structural biology. Lots of material to cover in great detail. Was required for me to take JBB2026H.
  * JBB2026 - Protein Structure, Folding and Design. Great course on a variety of interesting topics by a good cross section of the faculty in biochemistry. This easily complemented my prior undergraduate experience and gave me more insight into computational approaches.
  * JBB2025 - Protein Crystallography - Excellent course to truly understand crystallography, from first principles to how it is done today. Hard material if you're not particularly strong in the math section.
  * MBP1011 - Foundations in Bioinformatics. Fantastic course, especially Gil Prive's section. I really learnt a lot in terms of practical approaches, and it gave me my first taste of bioinformatics. I was also able to take this in lieu of BCH2021.
 * Student written description of experience:
{{{
After having spent 16 frustrating months in the wet lab, I left to work on a bioinformatics project and haven't looked back since. I tried taking some computer science courses, but I was insanely bored because the pace was far too slow. The abstract nature of these courses was a major turn off. Much better was to pursue a well defined project, using online resources (read: Google) for reference and code examples to get the job done. Having access to other students with expertise in programming was a major asset for my success.

I first started by learning C, which was not easy, but the firm understanding and capability I have since developed with this language has extended to all other programming languages.
My current programming language is PHP, a web scripting language that looks like C, can be programmed like C++/Java and reduces the application development time by orders of magnitude. I also taught myself to use SQL databases like MySQL effectively, which has been an asset to almost all projects. Programming is not hard, but like anything, passion, persistence and practice strengthen the ability, and pay off in dividends of efficiency.
}}}

== Student 6 ==
 * Background: B.Sc. and M.Sc. Human Physiology + bioinformatics diploma
 * Student description of experience:
  * JTB2010H - Proteomics and functional genomics - taken fall 2005 - This course centered around discussing journal articles published in the field of proteomics and bioinformatics. Every week we were assigned an article to read, and an author who had collaborated in the publication would be there in the subsequent class to lead the discussion and give a presentation on their research. A warning to CS students: the articles reviewed in this class tended to have a strong (and sometimes complicated) biochemistry component. Even with a background in biology I had to put in extra reading to understand the methodology. Assignments involved creating and evaluating mock grants. Overall I thought this course was outstanding and I would recommend it to anyone new to the field of bioinformatics. It was a great way not only to get a sense of what projects are being undertaken at U of T, but also to interact directly with most of the faculty. Grant writing skills are obviously also important, so it was good to gain experience there as well.
  * CSC2427H - Algorithms in molecular biology - taken winter 2006 - This course gave an overview of computational techniques used in both bioinformatics and computational biology, such as sequence alignment, network analysis and protein folding.
    There were 3 problem set assignments and one term project for which we had to either create or adapt an algorithm and apply it to a biological problem (involved writing a paper and giving a presentation). This was the first time I had ever taken a CS course and not surprisingly I found it to be very challenging. This course involved a lot of CS theory which I was forced to learn on the fly. The assignments were difficult (not just for me) but fairly graded by the instructor. If you have a strong CS background and are looking to pick up the biology along the way you'll probably do well, otherwise this course will be challenging.

= Recommended Textbooks =
These textbooks are recommended for graduate level computational biology:
 * [[http://www.wiley.com/WileyCDA/WileyTitle/productCd-0471478784.html|Bioinformatics: A Practical Guide to the Analysis of Genes and Proteins]], Editors: Andreas D. Baxevanis, B. F. Francis Ouellette
 * [[http://www.amazon.com/gp/product/0521585198/002-7081833-7683230?v=glance&n=283155|Algorithms on Strings, Trees, and Sequences: Computer Science and Computational Biology]], Author: Dan Gusfield
 * [[http://www.amazon.com/gp/product/0521629713/ref=pd_bxgy_img_b/002-7081833-7683230?ie=UTF8|Biological Sequence Analysis: Probabilistic Models of Proteins and Nucleic Acids]] by Richard Durbin, Sean R. Eddy, Anders Krogh, Graeme Mitchison
 * There is also the [[http://bioinformatics.org/faq/|Bioinformatics.org FAQ]] - see the section on books.