Computational Biology Major
Computational biology is an interdisciplinary field that specializes in the use of mathematical modeling, algorithms, and machine learning to frame complicated biological problems into less complicated computational or mathematical problems. It’s not a degree that’s typically offered to undergraduates at many universities, and success in the field requires a strong understanding of many computer science conventions, backed up by a strong intuition in experimental biology. The most exceptional people in the field of computational biology will be able to truly draw connections between both disciplines of computer and life sciences.
The computational biology degree for undergraduates is a joint degree between the School of Computing and Information and Dietrich School of Arts and Sciences, and students are able to choose which school they’d like to be in while pursuing the degree. However, it is worth noting that there are substantial differences between the degree requirements, and that computational biology at Pitt might not be what you expect.
Forewarning about the Degree
I will use this section of the article to describe some shortcomings of the degree, shape your expectations of pursuing it, and show how being self-driven can really change how competitive you are by taking advantage of this unique degree setting.
You need to be self-driven
You cannot blindly trust the degree requirements set by the university. You WILL need to go out of your way to enroll in courses to build the foundations you’ll need to become an excellent computational biologist.
A glaring issue is the degree’s lack of math and statistics prerequisites required to understand machine learning, coursework in low level programming like C, and the administration’s choice to require introduction to data science, but not machine learning. There is clearly some kind of disconnect between majoring in computational biology, and being a specialized data analyst, and depending on what you want, you’re going to need to work hard to fill in the gaps.
This means enrolling in calculus 2-3 and matrices (MATH0230-MATH0240, MATH0280), probability theory (STAT1151 and STAT1152), taking physics 1 for engineers (PHYS0174), and many other computer science courses such as computer organization and systems, machine learning, artificial intelligence, and whatever other courses may interest you (CS0447, CS0449, CS1675, CS1571, etc). To stay competitive with students doing the same degree but at top programs like CMU, MIT, or Stanford, you need to take the same coursework and be confident in the same topics that employers would expect any computational biologist to know.
To further reinforce why you can’t blindly follow the degree, the PhD program for computational biology at Pitt and CMU is actually a joint program. Furthermore, one of the first classes you’d take upon entering the PhD program is an Advanced Introduction to Machine Learning at CMU (10-715), and the prerequisites for that class are discrete math, probability theory, matrices, and calculus 3. If you don’t even take these courses during your undergraduate, how could you expect to get admitted to, let alone succeed in, a program whose first semester course requires them?
Dietrich or SCI?
I don’t think it’s an exaggeration to say that it’s much harder to become a strong computational biologist if you complete the degree in Dietrich rather than SCI, purely because Dietrich requires 11 (ELEVEN) general education courses. That is much less flexible than SCI, which allows overlaps and has fewer general education requirements.
SCI offers a degree that's naturally more flexible due to inherently not being an arts school, so you can spend your credits actually doing STEM courses that you need to be successful, and extra courses that you might find super interesting (I took some extra computer science courses I knew I'd like, and German 0101 was pretty interesting).
I’d also encourage taking as many general education courses as you can asynchronously at a community college over the summer, to offload your semesters. Dietrich is more flexible about this than SCI once you’re over 90 credits.
Am I cooked because I’m not at a t20?
Definitely not, but there’s no denying that you’re going to need to work harder than someone coasting through at a higher ranked school. You can definitely make your own opportunities. More importantly though, learning to be happy is the most valuable thing I’ve picked up at college, and comparison can ruin you. Lock in and make yourself into who you want to be at a reasonable pace. It’s much more important to play the long game, and you’ll definitely regret trying to do too much at once.
You won't do everything perfectly, and you're going to look back and wish you did at least one thing differently. However, if you're not making mistakes, you're not learning. Furthermore, if you're not happy with what you have now, you'll constantly be trying to fill a void that'll just swallow you up whole.
What am I supposed to do after undergrad?
If you want to stay in computational biology, you’re going to need your masters or doctorate, which is why playing the long game and ensuring you’re getting amazing grades in the relevant coursework, building relations for strong letters of recommendation and valuable mentorship, and being involved in extracurriculars you’re passionate about is the most important thing you’ll spend your undergrad doing.
I would not pursue this degree if you’re looking to enter the industry as soon as possible, since computational biology mostly demands experimentalism. However, this isn’t to say that there isn’t a demand for manpower from the industry. You’d just have to compete with people who have their masters and PhDs for these high profile roles, and I doubt many undergraduates stand as tall as people who have earned their license to think.
Pre-meds can be computational biologists
As the world is transforming and the current pinnacles of technology are becoming more available, it’s especially important for the next generation of physicians to be technologically adept. You can definitely be an undergraduate with aspirations to study medicine, and there’s a growing space for physician-scientists. It’s a good way to stand out, but I’d be extremely careful about what courses you take as a pre-med, as a strong GPA is imperative to success in the field, and the admission officers for med school are extremely unforgiving. Just make sure you look at what the admission requirements are, and that you take those courses.
This bit is definitely to scare pre-meds, so please read: I know a guy who got a B in organic chemistry 1, and when interviewing for med school, the interviewers asked him why he got a B in organic chemistry. Do NOT major in computational biology if you don’t think you can succeed in ALL of its disciplines. Med school is uber competitive, so tread lightly.
A fundamental problem with the technical comp bio courses
You will be required to take multiple courses in computational biology. Specifically, you’ll need to take the introduction to computational biology course (BIOSC1540), one of computational genomics (BIOSC1542) or simulation and modeling (BIOSC1544), and a capstone course which is hands-on, unsupervised research or software development (BIOSC1640, CS1640). BIOSC1540 was one of the most conceptually challenging courses I ever took, and I had to fight tooth and nail for that A-. Professor Maldonado will push you hard, and he’ll provide you with the same rigor a PhD student could expect. It was a very rewarding course. However, its successors have weak prerequisites, which leads to classes that simply can’t be as intensive or foundational. Computational genomics and simulation and modeling only need you to know Python syntax (not even object oriented programming), so naturally, the ceiling for how intense the courses are might not be very high depending on when you take them. In computational genomics, there’s no programming outside basic R scripting and UNIX terminal commands, so the course is theoretical and follows wet-lab techniques. This feels even worse in the capstone course, which only requires you to know data structures (CS0445), and up to bio and gen chem 2 (BIOSC0160, CHEM0120).
My capstone project had to do with machine learning and biochemistry, so not having a background in either could really throw you for a loop.
There’s obviously nothing you can do about that, but it’s worth noting.
Conclusion
If you’ve read everything above, you should be equipped with the context you need to make your own decisions about the degree and know what you’re signing up for. Let’s talk about the coursework now.
Declaration Requirements:
Computing for Scientists (CS0011)
CS 0011 or equivalent in Python: CS 0008 may be a possible alternative (offered more often and more flexible timing). You'll learn the syntax for Python and do a couple projects and labs revolving around the language. You might also get into object oriented programming and recursion, but not like you would in CMPINF0401.
Foundations of Biology 1 (BIOSC0150)
This course covers biology topics such as cell structure, genetics, metabolism and photosynthesis, and foundational chemical concepts. This class is intended for natural science majors and is a fair introduction. An honors course is available.
Foundations of Biology 2 (BIOSC0160)
This course covers biology topics such as evolution, ecology, reproduction, and biotechnology. Similarly, this is somewhat more intensive. An honors course is available.
General Chemistry 1 (CHEM0110)
This covers the first half of chemistry topics from atomic theory to thermochemistry. An honors course is available. A lab is included in the course (4 hours once a week) unless it's a retake.
General Chemistry 2 (CHEM0120)
This covers the second half of chemistry topics from acid base chemistry to thermochemistry, electrochemistry, and bonding theories. An honors course is available. A lab is included in the course (4 hours once a week) unless it's a retake.
You need to earn at least a C in all of these courses to declare the computational biology major.
The Computational Biology Degree
Core CS:
Introduction to Computing for Scientists (CS0011)
All of the CS 001X courses will introduce students to the concepts of computing and computer programming. Students in these courses will learn how a computer works and how to write programs in order to use the computer as a problem solving tool. A major focus of the class will be on developing problem-solving skills (e.g., how to decompose a problem into more manageable parts and how to combine those parts into an overall solution). CS 0011 in particular will focus on problems related to the natural sciences with an emphasis on computational biology. Domain-specific projects and labs will be assigned throughout the course to encourage students in the natural sciences to apply computing to their field of study.
You'll basically be learning Python syntax, and briefly cover object oriented programming and recursion, but nothing in depth like you would do in 401. The projects were actually pretty interesting to me at the time, so the course can definitely be a good way to learn the language and how to think in Python.
Intermediate Programming (CMPINF0401)
This is an intermediate programming course that focuses on programming via an object-oriented paradigm. Students entering CMPINF 0401 are expected to have some previous programming experience; the course then focuses on object-oriented programming, including classes, encapsulation and abstraction, inheritance, polymorphism and interfaces. Some introductory data structures and algorithms will also be covered in this course. This class is a programming-intensive course, and students will be expected to complete several non-trivial programming projects throughout the term.
Discrete Structures for Computer Science (CS0441)
The purpose of this course is to understand and use (abstract) discrete structures that are backbones of computer science. In particular, this class is meant to introduce logic, proofs, sets, relations, functions, counting, and probability, with an emphasis on applications in computer science.
Data Structures (CS0445)
This course emphasizes the study of the basic data structures of computer science (stacks, queues, trees, lists) and their implementations using the Java language. Included in this study are programming techniques that use recursion, reference variables, and dynamic memory allocation. Students in this course are also introduced to various searching and sorting methods and are expected to develop an intuitive understanding of the complexity of these algorithms.
Algorithms and Data Structures (CS1501)
As the second in a two-course sequence on algorithms and data structures, this course covers a broad range of the most commonly used algorithms. Some examples include algorithms for searching, encryption, compression, graphs, and dynamic programming. The students will implement and test several algorithms. The course is programming intensive.
This is probably the toughest course in the computer science core, and you'll be forced to think a lot about what you're doing and why. It's recommended to complete CS0445 and CS1501 as soon as possible.
Introduction to Data Science (CS1656)
This course aims to expose students to different data management, data manipulation, and data analysis techniques. The class will cover all the major data management paradigms (relational/SQL, XML/Xquery, RDF/SPARQL) including NOSQL and data stream processing approaches. Going beyond traditional data management techniques, the class will expose students to information retrieval, data mining, data warehousing, network analysis, and other data analysis topics. Time permitting, the class will include big data processing techniques, such as the map/reduce framework.
Core Biological Sciences:
Genetics (BIOSC0350)
This is a somewhat advanced level genetics course that goes deeper into foundational concepts covered in BIOSC 150 and 160. It is definitely a step up compared to the foundational biology courses, and important to establish your foundation in pursuing higher level classes and trying to determine applications using technology and math.
Biochemistry (BIOSC1000)
This is the hardest science course of the major. Definitely take care to make sure that you can put the required effort into this class to succeed. It's extremely fast paced, combines advanced biology concepts mostly studied in BIOSC 150 with chemistry concepts from 110, 120, and organic concepts from 310. However, I would mention that it is much more focused on biology than chemistry. Also, this course is extremely important for studying physiology, systems biology, and metabolism.
- Students may alternately choose BIOSC 1810 MACROMOLECULAR STRUCTURE AND FUNCTION and BIOSC 1820 METABOLIC PATHWAYS AND REGULATION in lieu of BIOSC 1000 . In this case, BIOSC 1820 becomes the elective course.
Computational Biology Core
Computational Biology (BIOSC1540)
Computational Biology (BIOSC1540) is the first course a student will take in the computational biology core. It functions as a fairly rigorous introduction to topics in bioinformatics and computational biology. The class changes a bit by semester to try and optimize the experience for students, and it’s a very rewarding course to get through. You’ll learn about computational decision making, design, and the tools that scientists around the world use every day and how they were made.
BIOSC1540 is probably the most conceptually challenging class I’ve ever taken. I was forced to give perfect answers to homework problems, determine the best correct answer out of multiple correct options on exams, and do a project to design my own protein. This was definitely the course that solidified my interest in the field.
Computational Genomics (BIOSC1542) / Simulation and Modeling (BIOSC1544)
Computational genomics is offered every even year spring semester, and simulation and modeling is offered every odd year spring semester. I would try to take both courses, and view each as an opportunity to learn about a specific part of a larger field to see what you might like more. The prerequisites for the courses only expect you to be familiar with Python syntax at most, and I can personally attest that computational genomics requires virtually no previous programming experience, except for a couple homework assignments where you need to use R.
I think the value of taking these courses is that you’ll undoubtedly get exposure to simulation pipelines, and get to see what goes on inside the mind of someone who would be using computational genomics tools to find an answer to a question that can only be revealed from mass amounts of data you can transform. It would certainly give you a leg up on other students who haven’t or can’t take the course, and you can really amplify your statement of purpose by already having experience and expectations.
Comp Bio Capstone (BIOSC1640) / Comp Bio Capstone (CS1640)
This is a capstone where you’ll be given a genuine research project from a lab at Pitt. You’ll be mostly unsupervised, and work autonomously with a team to try and find an answer to your research question. It’s mostly the culmination of your entire degree, but once again, the prerequisites are not insane, so the research can only be so complicated, and much of the reframing of the biological problem is done by your professor. You’re simply expected to execute the reframing that was done, and to carry out the computation.
The projects are usually secretive and under mutual nondisclosure. They’re really good projects where you can learn a lot, work with others, and walk away as a better computational biologist. It's a real experience at the end of the day, and you might get a publication out of it.
The biology department version of the course is intended as research, so it’s more like reproducible scripting and less like end to end software development. The computer science department version expects its students to work more on the end to end software development of bioinformatics tools.
Comp Bio Senior Seminar (BIOSC1630)
Includes (W) writing requirement for the major. This course focuses on reading and analyzing primary research literature in the field of computational biology, with an emphasis on effective techniques for communicating about the associated science both verbally and in writing. Many articles are to be read, critiqued, and discussed.
Co-requisites:
- MATH 0220: Calculus 1
- STAT 1000: Applied Statistics
- CHEM 110 & 120 (required for declaration too), 310 (organic 1)
- CHEM 0310: Organic Chemistry 1 is a notoriously difficult course which requires a deep understanding of chemistry fundamentals and the ability to visualize 3D structures of molecules in certain conditions, environments, and perspectives. Topics covered: nomenclature, stereochemistry, radical chemistry, fundamental reaction mechanisms, alcohols, sulfur chemistry, alkenes and alkynes. It may optionally include lab spectroscopy techniques.
Electives
- These are too numerous to list here. Visit the program catalog for the comprehensive list. There is a lot of overlap with pre-professional course requirements.
Extra notes
- The major is quite broad and covers a lot of different information. If you have room in your schedule, you could look into other departments that offer courses you might like (ex: Linguistics, Chemistry, Neuroscience, Math, Stats, Computer Science etc.) to get more specialized bioinformatics / computational biology knowledge.
- Satisfactory/No Credit information: "One core course required for the major may be taken on an S/NC basis. Co-requisite courses may be taken on an S/NC basis subject to School limitations. Please check with your School for specific information on S/NC grades."
- If you’re in Dietrich and try to get some overlap with SCI for a double major or a minor, do note that overlap is limited, so make sure to discuss with your advisor about how overlap can work out for you.
- Always make sure to discuss with your advisor and do your own research to know what's best for you. CSC also has a lot of people you can reach out to for help!
Some Recommended Classes
You know yourself better than anyone
Aside from foundational courses such as physics, math, statistics, and certain computer science courses, I don’t think it’s my place to recommend that you should double major in computer science / data science, or tell you to get a math / stats minor. I've observed that the degree has certain flaws in how it prepares you for some higher level courses, and that it sometimes fails to require certain higher level courses at all.
I am only going to list courses that I would specifically add to the degree. Just because I don't list a course here doesn't mean that you shouldn't take it! If you want to take diffeq (MATH0290), take it. If you want to take stochastic processes (STAT1731), take it. You're the one doing the degree after all.
I personally really wanted to take courses about operating systems, computer networking, compilers, and cloud computing, so because of that, I just happened to double major in computer science. I also expected classes like computational genomics to have more programming assignments. It definitely doesn’t mean my word is law. So in these recommendations, I will only ever list out some continuations to the cores in the degree that already exist, and some parts of the cores that I wish existed. Spend time learning and doing what you think you might like, and then spend the rest of your college career learning about the stuff you’ve discovered you enjoy. Remember, learning to be happy is more important than anything else.
Recommended Extra Classes
Computer Science
I’d recommend taking Computer Organization (CS0447), Systems Software (CS0449), Introduction to Machine Learning (CS1675), Algorithm Design (CS1510), and the other courses about natural language processing, computer vision, and deep learning depending on your interests.
Courses I’d say should 100% be on your radar are specifically 447, 449, 1510, and 1675.
447 and 449
Every computational biologist should be familiar with low level programming languages, even if you’d use them way less compared to other languages like Python. A lot of Python libraries are written in C, and you’ll definitely be expected to know about the libraries you’re using when working on something. 447 and 449 are the appropriate prerequisites for learning about lower level languages as they teach you a bit about how computers work under the hood, and then have assignments based around using C and its debugger.
To further reinforce my point, I'd encourage you to browse around some job postings at comp bio firms such as Recursion, DESRES, Schrödinger, and others to see what they expect their engineers to contribute to, and how. This is because software and other algorithms are usually prototyped in languages like Python that do a lot of heavy lifting under the hood. Once it's time to distribute the software, it's usually then rewritten in a performance oriented language such as C++.
C++ is probably the most employable language right now in all applications of technology and across a variety of companies, so the best thing you can do for yourself is reinforce yourself with coursework using a high performance language and understand its conventions.
1510
This course is hard, but I think I’d be doing you a disservice to not recommend it. Thinking is a very big part of any job, and this class will definitely help you learn to think. Dynamic programming, greedy algorithms, and parallel algorithms also have many applications in computational biology, and knowing about these algorithms will make you a much, much better scientist.
Note that this means you'll also need to enroll in CS1502, so it's definitely a commitment, although a worthwhile one if you decide to go through.
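To make the dynamic programming point a bit more concrete, here is a minimal sketch of a global sequence alignment score, in the style of the Needleman-Wunsch algorithm. It is only an illustration and not taken from any Pitt course; the function name and the scoring values (match = 1, mismatch = -1, gap = -1) are assumptions picked for the example.

```python
def alignment_score(a, b, match=1, mismatch=-1, gap=-1):
    """Best global alignment score of strings a and b.

    The scoring values are illustrative assumptions, not a standard.
    """
    rows, cols = len(a) + 1, len(b) + 1
    # score[i][j] holds the best score for aligning a[:i] with b[:j].
    score = [[0] * cols for _ in range(rows)]

    # Aligning a prefix against an empty string costs one gap per character.
    for i in range(1, rows):
        score[i][0] = i * gap
    for j in range(1, cols):
        score[0][j] = j * gap

    for i in range(1, rows):
        for j in range(1, cols):
            # Option 1: align a[i-1] with b[j-1] (match or mismatch).
            diag = score[i - 1][j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            # Option 2: a[i-1] is aligned against a gap in b.
            up = score[i - 1][j] + gap
            # Option 3: b[j-1] is aligned against a gap in a.
            left = score[i][j - 1] + gap
            score[i][j] = max(diag, up, left)

    return score[-1][-1]


print(alignment_score("GATTACA", "GATTACA"))  # 7
print(alignment_score("ACGT", "AGT"))         # 2
```

Each cell only depends on three neighbors that were already computed, which is what keeps the whole table at O(len(a) * len(b)) work instead of trying every possible alignment.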
1675
The bread and butter of computational biology is the use and development of machine learning algorithms, and you should definitely make sure you’re well equipped to understand the intricacies of your machine learning algorithms so you can make the best choice about what, when, and why you'd use a specific approach. This course requires you to be familiar with probability, and knowing about some topics in calculus would be helpful. This could also be the gateway that helps you decide which other AI courses you might want to take, as deep learning, computer vision, and NLP have many, many applications in our world today, and especially so in comp bio.
Biology, Chemistry, Physics
Pitt has a ton of biology and chemistry courses, and it’s definitely one of the university’s strong suits. Cell Biology (BIOSC1500) wouldn’t be a bad option, and there are a ton of other courses like synthetic biology, cancer biology, computational chemistry, and more that you could want to take if you’re interested.
For physics, I only took (calc-based) physics 1 just to be in the know about some basic concepts when doing calculations or when it came up in conversation during BIOSC1540. I’d definitely recommend it, and I know Professor Maldonado (who teaches 1540) would definitely recommend you take stat mech if you’re interested in modeling.
Math and Statistics
Don’t discount math and statistics. Remember, this is a degree that requires you to be just as good at math as you would be in computer science and biology.
Math
I would definitely take the entire calculus series up until calc 3. Calc 2 is needed to understand some concepts in probability theory, but more importantly, calc 3 has concepts that are required to even understand what’s going on in machine learning. Machine learning and math go together like PB&J. Calc 2 was also my favorite class ever, but apparently that’s a hot take. Calc 3 is a super important class, so take it!
The course codes for calculus are MATH0220, 0230, and 0240.
A good portion of computational biology is about simulations which are basically just a ton of calculations, so you’re going to want to get the processes of doing calculations to become second nature.
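As one small illustration of why calc 3 shows up in machine learning, here is a sketch of gradient descent on a function of two variables. The function f(x, y) = (x - 3)^2 + (y + 1)^2, the learning rate, and the step count are all assumptions chosen for the example; the partial derivatives are exactly the kind of thing calc 3 teaches you to compute.

```python
def gradient(x, y):
    """Partial derivatives of f(x, y) = (x - 3)**2 + (y + 1)**2."""
    df_dx = 2 * (x - 3)
    df_dy = 2 * (y + 1)
    return df_dx, df_dy


def gradient_descent(x=0.0, y=0.0, learning_rate=0.1, steps=100):
    """Repeatedly step against the gradient to move toward the minimum of f."""
    for _ in range(steps):
        df_dx, df_dy = gradient(x, y)
        x -= learning_rate * df_dx
        y -= learning_rate * df_dy
    return x, y


x, y = gradient_descent()
print(round(x, 4), round(y, 4))  # 3.0 -1.0
```

Training a machine learning model follows the same idea, except the function being minimized is a loss over many parameters rather than a bowl in two variables.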
Statistics
Statistics and probability are likely even more foundational to being knowledgeable about many biological systems, and (once again) machine learning! Probability theory (STAT1151) and mathematical statistics (STAT1152) are going to teach you topics you’ll carry for your entire career.
Probability theory has become indispensable in computer science: it's at the core of AI and ML, which require decision making under uncertainty; it's integral to CS theory, where randomization and probabilistic analysis are the basis of many algorithms; and it's needed to model the performance of systems and networks.
Probability is also just as essential to biology, since you'll be finding yourself predicting random events, managing complex systems, and understanding molecular behavior. Genetics is literally just introductory probability theory.
This also extends to quantum states, reaction likelihood, and molecular motion in chemistry and physics.
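To show how genetics turns into introductory probability, here is a small sketch that enumerates a single-gene cross (a Punnett square) and reports the probability of each offspring genotype. It assumes simple Mendelian segregation, meaning each parent passes on either of its two alleles with equal probability; the function name is made up for the example.

```python
from collections import Counter
from fractions import Fraction
from itertools import product


def offspring_genotypes(parent1, parent2):
    """Genotype probabilities for a single-gene cross, e.g. "Aa" x "Aa".

    Assumes each parent contributes either allele with probability 1/2.
    """
    # Every pairing of one allele from each parent is equally likely.
    # Sorting puts the uppercase (dominant) allele first, so "aA" becomes "Aa".
    counts = Counter(
        "".join(sorted(allele1 + allele2))
        for allele1, allele2 in product(parent1, parent2)
    )
    total = sum(counts.values())
    return {genotype: Fraction(n, total) for genotype, n in counts.items()}


for genotype, probability in offspring_genotypes("Aa", "Aa").items():
    print(genotype, probability)
# AA 1/4
# Aa 1/2
# aa 1/4
```

This is the classic 1:2:1 ratio, obtained by counting equally likely outcomes in a sample space, which is the first thing a probability course covers.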
Miscellaneous
It’s also worth noting that you can cross register for courses at CMU. They have a lot more variety in computational biology, so make sure to check them out for something interesting that Pitt might not offer. I’ve always wanted to try and take their string algorithms course (02-414) or epigenetics class (02-319).
Scheduling
There’s a ton of different variations based on your circumstances, and I’d encourage reaching out to an officer or someone in CSC to help you get feedback about your schedules. The bit/byte program is great for this, and people in the club are still available for advice year round, regardless of your relation to them.
You will definitely have to take a long look in the mirror and think about the kind of person you are. Can you handle doing 2, 3, or 4 STEM classes a semester? Can you handle the most intense courses in SCI and Dietrich? Are you willing to take the time to think of your own schedule, and to be patient to discover what you like?
It’s worth the time, and you’ll thank yourself in the long run. The professors in the program, and students in the club are your friends, and they’re going to want you to succeed and be happy. Don’t go through things alone for no reason.
Final Statement
I realize that this was a pretty brutal description of the degree, and I want to be clear that I'm not trying to fearmonger. I love this field, and I would definitely major in computational biology again. I’ve become friends with exceptional people, set my sights on incredible goals, and the most astounding thing is that I managed to achieve some of them!
The industry is becoming more competitive, and I want you to be ready for it.
My life as a computational biologist is definitely better than it would have been otherwise, and if I had the chance to start college all over again, I’d pick this major again.
The most important thing is that you also feel the same way about this degree, so I felt that I needed to be kind of nuclear with how I described things or voiced my opinions. Computational biology is a paradigm that changed my life, and I’m excited to devote much more of my time to learning about all the intricacies of this field.