MS in Data Science
Affiliation
Basic information
Program duration
2
Program credits
32
Estimated tuition
$24,489
Application deadlines
-
Fall
-
Early application
May 1
-
Other
-
Spring
October 15 -
International students
5.1
Application information
TOEFL/GRE Code
3665
Application fee
75-100
GPA requirement
3.00
TOEFL requirement
100.00
Prerequisite
Placement Exams
Each incoming master's student, regardless of background, takes two placement exams administered one week before the semester begins. One exam covers fundamentals of computer science and programming skills; the other covers basic statistics, probability, and linear algebra. A student who does not earn a B or above on a part of the placement exam must take the corresponding introductory course.
Introduction to Programming for Data Science (DS 5010)
The introductory course on fundamentals of programming and data structures covers data structures (lists, arrays, trees, hash tables, etc.), program design, programming practices, testing, debugging, maintainability, data collection techniques, and data cleaning and preprocessing. The course includes a class project in which students apply these concepts to collect data from the web, then clean and preprocess it so that it is ready for analysis.
Introduction to Linear Algebra and Probability for Data Science (DS 5020)
The introductory course on basics of statistics, probability, and linear algebra covers random variables, frequency distributions, measures of central tendency, measures of dispersion, moments of a distribution, discrete and continuous probability distributions, chain rule, Bayes' rule, correlation theory, basic sampling, matrix operations, trace of a matrix, norms, linear independence and ranks, inverse of a matrix, orthogonal matrices, range and null space of a matrix, the determinant of a matrix, positive semidefinite matrices, eigenvalues and eigenvectors.
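As a rough indication of the level of the probability material named above, Bayes' rule can be worked through in a few lines of Python. This is only an illustrative sketch and is not taken from the placement exams or from DS 5020; the function name `bayes_posterior` and the numbers are hypothetical.

```python
from fractions import Fraction


def bayes_posterior(prior, likelihood, false_positive_rate):
    """Return P(H | E) given P(H), P(E | H), and P(E | not H)."""
    # Total probability of the evidence, summed over H and not-H
    evidence = likelihood * prior + false_positive_rate * (1 - prior)
    # Bayes' rule: P(H | E) = P(E | H) * P(H) / P(E)
    return likelihood * prior / evidence


# Hypothetical inputs chosen only for illustration:
# P(H) = 1/100, P(E | H) = 9/10, P(E | not H) = 1/20
posterior = bayes_posterior(Fraction(1, 100), Fraction(9, 10), Fraction(1, 20))
print(posterior)  # prints 2/13
```

Exact fractions are used so the arithmetic can be checked by hand: the evidence term is 117/2000, and (9/1000) / (117/2000) = 2/13.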
Application materials
Resume + PS + Transcript + 3 Recommendations + GRE + TOEFL/IELTS
Program description
The Master of Science in Data Science curriculum requires five core courses that jointly represent the essential technical skills in data science. Two courses in algorithms and data processing examine foundational concepts and languages, focusing on data representation, storage, manipulation, and query, as well as large-scale computing and optimization. Two core courses in machine learning and data mining introduce concepts on data modeling, representation, uncovering associations, and making predictions. The capstone course presents a holistic view of data science. Through experiential learning, students are exposed to the real-world challenges of implementing data science techniques to solve meaningful problems and effectively communicate with data. The courses are tailored toward technically or mathematically trained students.
The five core courses include:
Two core courses in algorithms and data processing
Two core courses in machine learning and data mining
One core course in information visualization
Three elective courses are drawn from a selection of courses across Northeastern.
Learning Outcomes
Students who complete the MS degree will be able to:
Collect data from numerous sources (databases, files, XML, JSON, CSV, and Web APIs) and integrate them into a form in which the data is fit for analysis
Use R and Python to explore data, produce summary statistics, and perform statistical analyses; use standard data mining and machine-learning models for effective analysis
Select, plan, and implement storage, search, and retrieval components of large-scale structured and unstructured repositories
Retrieve data for analysis, which requires knowledge of standard retrieval mechanisms such as SQL and XPath, as well as retrieval of unstructured information such as text, images, and a variety of alternate formats
Match the methodological principles and limitations of machine learning and data mining methods to specific applied problems, and communicate the applicability and the advantages and disadvantages of the methods for the specific problem to non-data experts
Carry out the full data analysis workflow, including unsupervised class discovery, supervised class comparison, and supervised class prediction; summarize, interpret, and communicate the results of the analysis
Organize visualization of data for analysis, understanding, and communication; choose an appropriate visualization method for a given data type using effective design and human perception principles
Develop methods for modeling, analyzing, and reasoning about data arising in one or more application domains such as social science, health informatics, web and social media, climate informatics, urban informatics, geographical information systems, business analytics, bioinformatics, complex networks, public health, and game design
Manage, process, analyze, and visualize data at scale. This outcome allows students to handle data where conventional information technology fails.
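To give a concrete sense of the first outcomes in the list above (collecting data from several formats, integrating it, and retrieving it with SQL), here is a minimal Python sketch. It is not drawn from the program's course materials; the sample data, the table name `results`, and the helper names `load_records` and `top_scorer` are all hypothetical, and only the standard library is used.

```python
import csv
import io
import json
import sqlite3

# Hypothetical sample data standing in for a CSV file and a JSON Web API response
CSV_TEXT = "id,name\n1,Ada\n2,Grace\n"
JSON_TEXT = '[{"id": 1, "score": 91}, {"id": 2, "score": 85}]'


def load_records(csv_text, json_text):
    """Join CSV rows and JSON objects on their shared "id" field."""
    names = {int(row["id"]): row["name"]
             for row in csv.DictReader(io.StringIO(csv_text))}
    scores = {item["id"]: item["score"] for item in json.loads(json_text)}
    # Keep only ids present in both sources, in a stable order
    return [(i, names[i], scores[i]) for i in sorted(names.keys() & scores.keys())]


def top_scorer(records):
    """Load the integrated records into SQLite and retrieve the best score with SQL."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE results (id INTEGER PRIMARY KEY, name TEXT, score INTEGER)"
    )
    conn.executemany("INSERT INTO results VALUES (?, ?, ?)", records)
    row = conn.execute(
        "SELECT name, score FROM results ORDER BY score DESC LIMIT 1"
    ).fetchone()
    conn.close()
    return row


records = load_records(CSV_TEXT, JSON_TEXT)
print(top_scorer(records))  # prints ('Ada', 91)
```

An in-memory SQLite database keeps the sketch self-contained; a real project would more likely read from files or a live API and write to a persistent store.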