CHRIST (Deemed to be University), Bangalore

DEPARTMENT OF STATISTICS

sciences

Syllabus for
Master of Science (Statistics)
Academic Year  (2020)

 
1 Semester - 2020 - Batch
Course Code | Course | Type | Hours Per Week | Credits | Marks
MST131 | PROBABILITY THEORY | Core Courses | 5 | 5 | 100
MST132 | DISTRIBUTION THEORY | Core Courses | 5 | 5 | 100
MST133 | MATRIX THEORY AND LINEAR MODELS | Core Courses | 5 | 5 | 100
MST134 | RESEARCH METHODOLOGY AND LATEX | Core Courses | 2 | 2 | 50
MST171 | SAMPLE SURVEY DESIGNS | Core Courses | 6 | 5 | 150
MST172 | STATISTICAL COMPUTING USING R | Core Courses | 4 | 3 | 100
2 Semester - 2020 - Batch
Course Code | Course | Type | Hours Per Week | Credits | Marks
MST231 | STATISTICAL INFERENCE-I | Core Courses | 4 | 4 | 100
MST232 | STOCHASTIC PROCESSES | Core Courses | 4 | 4 | 100
MST233 | CATEGORICAL DATA ANALYSIS | Core Courses | 4 | 4 | 100
MST271 | REGRESSION ANALYSIS | Core Courses | 6 | 5 | 150
MST272 | STATISTICAL COMPUTING USING PYTHON | Core Courses | 4 | 3 | 100
MST273A | PRINCIPLES OF DATA SCIENCE AND DATA BASE TECHNIQUES | Discipline Specific Elective Courses | 5 | 4 | 150
MST273B | SURVIVAL ANALYSIS | Discipline Specific Elective Courses | 5 | 4 | 150
MST273C | STATISTICAL QUALITY CONTROL | Discipline Specific Elective Courses | 5 | 4 | 150
MST281 | RESEARCH MODELING AND IMPLEMENTATION | Core Courses | 2 | 1 | 50
    

    

Introduction to Program:
The Master of Science in Statistics at CHRIST (Deemed to be University) offers students a blend of theoretical and applied statistics across a wide spectrum. It aims to build awareness of the importance of the conceptual framework of statistics across diverse fields and to provide practical training in applying statistical methods to data analysis using programming languages and statistical software such as R, Python, SPSS and Excel. The curriculum is designed to cater to the needs of stakeholders, to help students secure placements in industries and institutions on successful completion of the course, and to give them ample skills and opportunities to meet the challenges of national-level competitive examinations such as CSIR NET in Mathematical Science, Indian Statistical Service (ISS) and RBI research officer.

Programme Outcome/Programme Learning Goals/Programme Learning Outcome:

PO1: Engage in continuous reflective learning in the context of technology and scientific advancement.

PO2: Identify the need and scope of Interdisciplinary research.

PO3: Enhance research culture and uphold scientific integrity and objectivity

PO4: Understand the professional, ethical and social responsibilities

PO5: Understand the importance and the judicious use of technology for the sustainability of the environment

PO6: Enhance disciplinary competency, employability and leadership skills

Programme Specific Outcome:

PSO1: Demonstrate analytical and problem-solving skills to identify and apply appropriate principles and methodologies of statistics in real-time problems.

PSO2: Demonstrate the execution of statistical experiments or investigations, analyse and interpret using appropriate statistical methods, including statistical software and report the findings of experiments or studies accurately.

PSO3: Demonstrate acquaintance with contemporary trends in industrial/research settings and innovate novel solutions to existing problems.

PSO4: Demonstrate competency as a statistician in order to succeed in a broad range of analytic, scientific, government, financial, health, technical and other fields

Assessment Pattern

CIA - 50%

ESE - 50%

Examination And Assessments

CIA - 50%

ESE - 50%

MST131 - PROBABILITY THEORY (2020 Batch)

Total Teaching Hours for Semester:75
No of Lecture Hours/Week:5
Max Marks:100
Credits:5

Course Objectives/Course Description

 

To enable students to use measure-theoretic and analytical techniques for understanding probability concepts.

Course Outcome

CO1: Understand measure and measurable functions

CO2: Analyse probability concepts using measure-theoretic approach

CO3: Identify applications of different limit theorems in statistical problems

CO4: Apply Radon-Nikodym theorem in conditional probability 

Unit-1
Teaching Hours:15
Probability and Random variable
 

Algebra of sets, Fields, Sigma fields, Inverse function, Measurable functions, Random variables, Lebesgue measure, Lebesgue-Stieltjes measure, Counting measure, Discrete probability space, General probability space as normed measure space, Induced probability space. Distribution function of a random variable, Distribution function of random vectors. Independence of random variables.

Unit-2
Teaching Hours:15
Expectation and Generating functions
 

Integration with respect to measure (Introduction only), Expectation and moments: Definition and properties, Moment generating functions, Moment inequalities: Chebyshev's, Hölder's, Jensen's and basic inequalities, Product spaces and Fubini's theorem, Characteristic function and properties (idea and statement only).

Unit-3
Teaching Hours:15
Convergence
 

Modes of convergence: Convergence in probability, in distribution, in rth mean, almost sure convergence and their inter-relationships, Convergence theorem for expectation such as Monotone convergence theorem, Fatou’s lemma, Dominated convergence theorem.

Unit-4
Teaching Hours:15
Limit Theorems
 

Law of large numbers, Convergence of series of independent random variables, Kolmogorov's inequality, Weak law of large numbers (Khinchine's and Kolmogorov's), Kolmogorov's strong law of large numbers, Central limit theorems for i.i.d. random variables, Lindeberg-Lévy and Lyapunov's CLT, Lindeberg-Feller CLT.
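The Lindeberg-Lévy CLT listed in this unit can be illustrated numerically. The following is a minimal Python sketch, not part of the prescribed syllabus: the function name `standardized_means`, the Uniform(0, 1) summands, and the sample sizes are all assumptions chosen for illustration. It simulates standardized sample means, whose empirical mean and variance should be close to 0 and 1.

```python
import math
import random
import statistics


def standardized_means(n, replications, seed=0):
    """Simulate standardized means of n i.i.d. Uniform(0, 1) variables."""
    rng = random.Random(seed)
    mu = 0.5                      # mean of Uniform(0, 1)
    sigma = math.sqrt(1.0 / 12)   # standard deviation of Uniform(0, 1)
    out = []
    for _ in range(replications):
        xbar = sum(rng.random() for _ in range(n)) / n
        # sqrt(n) * (xbar - mu) / sigma is approximately N(0, 1) for large n
        out.append(math.sqrt(n) * (xbar - mu) / sigma)
    return out


z = standardized_means(n=30, replications=2000, seed=1)
print("mean of standardized means:", statistics.fmean(z))
print("variance of standardized means:", statistics.pvariance(z))
```

Plotting a histogram of `z` against the standard normal density would give the usual visual check.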

Unit-5
Teaching Hours:15
Conditioning
 

Conditional expectation and its properties, Conditional probabilities, Radon-Nikodym Theorem (Statement only) and its applications, Bayes' theorem, Martingales, Submartingales, Martingale convergence theorem, Decomposition of submartingales.

Text Books And Reference Books:

1. Billingsley P (2012), Probability and Measure, Anniversary Ed., John Wiley.

2. Bhat, B.R, (2014), Modern Probability Theory, 4th Ed., New Age International.

3. Rohatgi, V.K. and Saleh, A.K.Md.E, (2014), An Introduction to Probability and Statistics, 3rd Ed., John Wiley & Sons.

Essential Reading / Recommended Reading

1. Feller W, (2008), An Introduction to Probability Theory and its Applications, Volume I, 3rd Ed., Wiley Eastern.

2. Feller W, (2008), An Introduction to Probability Theory and its Applications, Volume II, 3rd Ed., Wiley Eastern.

3. Basu A.K, (2012), Measure Theory and Probability, 2nd Ed., PHI.

4. Durrett R, (2010), Probability: Theory and Examples, 4th Ed., Cambridge University Press.

Evaluation Pattern

Component | Marks
CIA I | 10
Mid Semester Examination (CIA II) | 25
CIA III | 10
Attendance | 05
End Semester Exam | 50
Total | 100

MST132 - DISTRIBUTION THEORY (2020 Batch)

Total Teaching Hours for Semester:75
No of Lecture Hours/Week:5
Max Marks:100
Credits:5

Course Objectives/Course Description

 

To enable students to understand different probability distributions and to model real-life problems using them.

Course Outcome

CO1: To understand different families of probability distributions.

CO2: Analyse well-known probability distributions as special cases of different families of distributions.

CO3: To identify different distributions arising from sampling from normal distribution.

CO4: To apply probability distribution in various statistical problems.

Unit-1
Teaching Hours:15
Discrete Distributions
 

Modified power series family and properties. Binomial, Negative binomial, Logarithmic series and Lagrangian distributions and their properties as special cases of the results from modified power series family, hypergeometric distribution and its properties. 

Unit-2
Teaching Hours:15
Continuous Distributions
 

Pearsonian system of distributions, Beta, Gamma, Pareto and Normal as special cases of the Pearson family and their properties. Exponential family of distributions. 

Unit-3
Teaching Hours:15
Sampling distributions
 

Sampling distributions of the mean and variance from a normal population, independence of mean and variance, Chi-square, Student's t and F distributions and their non-central forms. Order statistics and their distributions.

Unit-4
Teaching Hours:15
Multivariate distributions
 

Bivariate Poisson, Multinomial distribution, Multivariate normal (definition only), bivariate exponential distribution of Gumbel, Marshall and Olkin and Block and Basu, Dirichlet distribution.

Unit-5
Teaching Hours:15
Quadratic forms
 

Quadratic forms in normal variables: distribution and properties, Cochran's theorem and its applications.

Text Books And Reference Books:

1. Rohatgi, V.K. and Saleh, A.K.Md.E. (2015) An Introduction to Probability and Statistics, 3rd Ed., John Wiley & Sons.

2. Arnold B.C, Balakrishnan N and Nagaraja H.N (2012) A first course in order statistics.

3. Galambos J and Kotz S (1978) Characterization of Probability distributions, Springer-Verlag.

4. Elderton, W. P., & Johnson, N. L. (2009). Systems of frequency curves, Cambridge University Press.

Essential Reading / Recommended Reading

1. Johnson N.L, Kotz S and Kemp A.W (2005) Univariate discrete distributions, 3rd Ed., John Wiley.

2. Johnson N.L, Kotz S and Balakrishnan N (2017) Continuous univariate distributions I & II, John Wiley.

3. Johnson N.L, Kotz S and Balakrishnan N (2000) Multivariate Distribution, 2nd Ed., John Wiley.

Evaluation Pattern

Component | Marks
CIA I | 10
Mid Semester Examination (CIA II) | 25
CIA III | 10
Attendance | 05
End Semester Exam | 50
Total | 100

MST133 - MATRIX THEORY AND LINEAR MODELS (2020 Batch)

Total Teaching Hours for Semester:75
No of Lecture Hours/Week:5
Max Marks:100
Credits:5

Course Objectives/Course Description

 

This course is offered to make students understand the critical aspects of matrix theory and linear models which are used in different areas of statistics such as regression analysis, multivariate analysis, design of experiments and stochastic processes. 

Course Outcome

CO1: Understand vector-space and different operations on it

CO2: Analyse system of linear equations using matrix theoretic approach

CO3: Identify applications of matrix theory in statistical problems

CO4: Apply matrix theory in linear models

Unit-1
Teaching Hours:15
Vector Space
 

Vectors, Operations on vector space, subspace, nullspace and column space, Linearly independent sets, spanning set, bases, dimension, rank, change of basis.

Unit-2
Teaching Hours:15
System of linear equations
 

Matrix operations, Linear equations, row reduced and echelon forms, Homogeneous system of equations, Linear dependence.

Unit-3
Teaching Hours:15
Linear transformations
 

Algebra of linear transformations, Matrix representations, rank nullity theorem, determinants, eigenvalues and eigenvectors, Cayley-Hamilton theorem, Jordan canonical forms, orthogonalisation process, orthonormal basis.
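The orthogonalisation process named in this unit is usually taught as the Gram-Schmidt procedure. A minimal Python sketch follows, written here for illustration only: the helper names `dot` and `gram_schmidt` and the tolerance `tol` are assumptions, and the modified (sequential) variant is used because it is numerically more stable than the classical one.

```python
import math


def dot(u, v):
    """Euclidean inner product of two vectors given as lists."""
    return sum(a * b for a, b in zip(u, v))


def gram_schmidt(vectors, tol=1e-12):
    """Return an orthonormal basis for the span of `vectors`.

    Vectors that are (numerically) linearly dependent on the earlier ones
    are skipped, so the length of the result equals the rank of the set.
    """
    basis = []
    for v in vectors:
        w = [float(a) for a in v]
        for q in basis:
            coeff = dot(w, q)                       # component of w along q
            w = [a - coeff * b for a, b in zip(w, q)]
        norm = math.sqrt(dot(w, w))
        if norm > tol:
            basis.append([a / norm for a in w])     # normalise to unit length
    return basis


orthonormal = gram_schmidt([[1, 1, 0], [1, 0, 1], [0, 1, 1]])
print(len(orthonormal), "orthonormal vectors found")
```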

Unit-4
Teaching Hours:15
Quadratic forms and special matrices useful in statistics
 

Reduction and classification of quadratic forms, Special matrices: symmetric matrices, positive definite matrices, idempotent and projection matrices, stochastic matrices, Gramian matrices, dispersion matrices

Unit-5
Teaching Hours:15
Linear models
 

Fitting the model, ordinary least squares, estimability of parametric functions, Gauss – Markov theorem, applications: regression model, analysis of variance. 

Text Books And Reference Books:

1. David C. Lay, Steven R. Lay, Judi J. McDonald (2016) Linear algebra and its applications.

2. Gentle, J. E. (2017) Matrix algebra- Theory, Computations and Applications in Statistics. Springer texts in statistics, Springer, New York.

3. Strang, G. (2006) Linear Algebra and its Applications: Thomson Brooks. Cole, Belmont, CA, USA.

Essential Reading / Recommended Reading

1. Searle, S. R., & Khuri, A. I. (2017). Matrix algebra useful for statistics. John Wiley & Sons.

2. Rencher, A. C., & Schaalje, G. B. (2008) Linear models in statistics, 2nd Ed., John Wiley & Sons.

3. Christensen, R. (2011) Plane answers to complex questions: the theory of linear models. Springer Science & Business Media.

4. Khuri, A. I. (2003). Advanced calculus with applications in statistics, 2nd Ed., John Wiley & Sons. 

Evaluation Pattern

Component | Marks
CIA I | 10
Mid Semester Examination (CIA II) | 25
CIA III | 10
Attendance | 05
End Semester Exam | 50
Total | 100

MST134 - RESEARCH METHODOLOGY AND LATEX (2020 Batch)

Total Teaching Hours for Semester:30
No of Lecture Hours/Week:2
Max Marks:50
Credits:2

Course Objectives/Course Description

 

To acquaint students with different methodologies in statistical research and to enable them to prepare scientific articles using LaTeX.

Course Outcome

CO1: To understand research problem

CO2: To identify suitable methodology for solving the research problem

CO3: To produce scientific articles using LaTeX

Unit-1
Teaching Hours:15
Fundamentals of research
 

Objectives, Motivation, Utility. Concept of theory, empiricism, deductive and inductive theory. Characteristics of scientific method, Understanding the language of research: Concept, Construct, Definition, Variable. Research process: Problem identification and formulation, Research question, Investigation question, Logic and importance.

Unit-2
Teaching Hours:15
Scientific writing
 

Principles of mathematical writing, LaTeX: writing a research paper, survey article, thesis writing, Beamer: preparing presentations

Text Books And Reference Books:

1. Nicholas J. Higham, (2018) Handbook of Writing for the Mathematical Sciences, Second Edition, SIAM.

2. L. Lamport (2014), LaTeX, a Document Preparation System, 2nd ed, Addison-Wesley.

Essential Reading / Recommended Reading

Kothari, C. R. and Garg, G. (2014). Research methodology: Methods and techniques. 3rd Ed., New Age International.

Evaluation Pattern

CIA - 50%

ESE - 50%

MST171 - SAMPLE SURVEY DESIGNS (2020 Batch)

Total Teaching Hours for Semester:90
No of Lecture Hours/Week:6
Max Marks:150
Credits:5

Course Objectives/Course Description

 

To impart the knowledge of different sample survey designs useful in the collection of scientific data

Course Outcome

CO1: Understand different steps in designing a sample survey.

CO2: Analyse different sample survey designs and find estimators.

CO3: Identify the use of different sample survey designs.

CO4: Apply suitable sample survey design in real-life problems.

Unit-1
Teaching Hours:18
Random sampling designs
 

Sampling vs census, simple random sampling with replacement (SRSWR) and without replacement (SRSWOR) of units, estimators of mean, total and variance, determination of sample size, sampling for proportions, Stratified sampling scheme: estimation and allocation of sample size, comparison with simple random sampling schemes.

Lab Exercises:

1. Drawing samples with SRSWR and SRSWOR and estimation of parameters

2. Estimation of parameters using a sample of proportions

3. Drawing stratified sample and estimation of parameters
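Lab exercise 1 can be sketched as follows. This is a minimal Python illustration under stated assumptions, not the prescribed lab solution: the function names `srswr`, `srswor` and `estimate_mean` are hypothetical, the population is a plain list of values, and the variance estimator of the sample mean is s²/n with replacement and (1 - n/N) s²/n without replacement.

```python
import random
import statistics


def srswr(population, n, rng):
    """Draw a simple random sample of size n with replacement."""
    return [rng.choice(population) for _ in range(n)]


def srswor(population, n, rng):
    """Draw a simple random sample of size n without replacement."""
    return rng.sample(population, n)


def estimate_mean(sample, N, with_replacement):
    """Return the sample mean and its estimated variance.

    Without replacement, the finite population correction (1 - n/N) applies.
    """
    n = len(sample)
    ybar = statistics.fmean(sample)
    s2 = statistics.variance(sample)            # divisor n - 1
    fpc = 1.0 if with_replacement else 1.0 - n / N
    return ybar, fpc * s2 / n


rng = random.Random(2020)                        # fixed seed for reproducibility
population = list(range(1, 101))                 # hypothetical population of N = 100 values
print(estimate_mean(srswr(population, 10, rng), 100, with_replacement=True))
print(estimate_mean(srswor(population, 10, rng), 100, with_replacement=False))
```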

Unit-2
Teaching Hours:18
Ratio and regression estimators
 

Bias and mean square error, estimation of variance, confidence interval, comparison with mean per unit estimator, optimum property of ratio estimator, unbiased ratio type estimator, ratio estimator in stratified random sampling, Difference estimator and Regression estimator:- Difference estimator, regression estimator, comparison of regression estimator with mean per unit and ratio estimator, regression estimator in stratified random sampling.

Lab Exercises:

4. Estimation using ratio estimator

5. Estimation using regression estimator

6. Ratio estimator and regression estimator in stratified sampling
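The ratio and regression estimators of a population mean covered in this unit can be sketched in a few lines. The Python code below is illustrative only and assumes simple random sampling with a known population mean of the auxiliary variable; the function names and the argument `x_pop_mean` are hypothetical.

```python
def ratio_estimate_mean(y, x, x_pop_mean):
    """Ratio estimator: (ybar / xbar) * known population mean of x."""
    r_hat = sum(y) / sum(x)
    return r_hat * x_pop_mean


def regression_estimate_mean(y, x, x_pop_mean):
    """Linear regression estimator: ybar + b * (population mean of x - xbar)."""
    n = len(y)
    xbar = sum(x) / n
    ybar = sum(y) / n
    sxy = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y))
    sxx = sum((xi - xbar) ** 2 for xi in x)
    b = sxy / sxx                                # sample regression coefficient
    return ybar + b * (x_pop_mean - xbar)


x = [1, 2, 3, 4]
y = [5, 7, 9, 11]                                # y = 3 + 2x, a line not through the origin
print(ratio_estimate_mean(y, x, x_pop_mean=5))
print(regression_estimate_mean(y, x, x_pop_mean=5))
```

When the relationship between y and x is linear but does not pass through the origin, as in this toy data, the regression estimator recovers the line while the ratio estimator does not, which is the comparison the unit asks for.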

Unit-3
Teaching Hours:18
Varying probability sampling designs
 

With and without replacement sampling schemes: PPS and PPSWR schemes, Selection of samples, estimators: ordered and unordered estimators. Πps sampling schemes.

Lab Exercises:

7. Exercise on the PPS scheme

8. Exercise on the PPSWR scheme

9. Exercise on Πps sampling scheme
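For the with-replacement PPS exercise, one common textbook approach is the cumulative total method combined with the Hansen-Hurwitz estimator of the population total. The Python sketch below is an assumption-laden illustration rather than the course's own procedure: it assumes positive integer size measures, and the names `ppswr_sample` and `hansen_hurwitz_total` are hypothetical.

```python
import bisect
import itertools
import random


def ppswr_sample(sizes, n, rng):
    """Select n unit indices with replacement, with probability proportional
    to size, using the cumulative total method (positive integer sizes assumed)."""
    cum = list(itertools.accumulate(sizes))
    total = cum[-1]
    picks = []
    for _ in range(n):
        r = rng.randint(1, total)                  # random number between 1 and the total size
        picks.append(bisect.bisect_left(cum, r))   # first unit whose cumulative total >= r
    return picks


def hansen_hurwitz_total(y, sizes, sample_idx):
    """Hansen-Hurwitz estimator: average of y_i / p_i over the sampled draws."""
    total = sum(sizes)
    return sum(y[i] / (sizes[i] / total) for i in sample_idx) / len(sample_idx)


rng = random.Random(7)
sizes = [10, 20, 30, 40]                           # hypothetical size measures
y = [6, 9, 16, 19]                                 # hypothetical study variable
idx = ppswr_sample(sizes, 3, rng)
print(idx, hansen_hurwitz_total(y, sizes, idx))
```

If y is exactly proportional to the size measure, every draw gives the same value of y_i / p_i and the estimator reproduces the population total with zero variance, which explains why a good size measure matters.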

Unit-4
Teaching Hours:18
Advanced sampling designs
 

Systematic sampling scheme: estimation of population mean and variance, comparison of systematic sampling with SRS and stratified random sampling, circular systematic sampling, Cluster sampling: estimation of population mean, estimation of efficiency by a cluster sample, variance function, determination of optimum cluster size, Multistage sampling: estimation of population total with SRS sampling at both stages, multiphase sampling (outline only), quota sampling, network sampling; Adaptive sampling: introduction and estimators under adaptive sampling. Introduction to small area estimation.

Lab Exercises:

10. Exercise on the systematic sampling scheme

11. Exercise on cluster sampling

12. Exercise on multi-stage sampling

13. Exercise on small area estimation

Unit-5
Teaching Hours:18
Errors in Sample Survey
 

Sampling and non-sampling errors, the effect of unit nonresponse in the estimate, procedures for unit nonresponse

Lab Exercises:

14. Exercise on the sensitivity of efficiency due to sampling errors

15. Procedures for non-response

Text Books And Reference Books:

1. Arnab, R. (2017). Survey sampling: Theory and Applications. Academic Press.

2. Singh, D. and Chaudhary, F.S. (2018) Theory and Analysis of Sample Survey Designs, New Age International.

Essential Reading / Recommended Reading

1. Cochran, W.G. (2007) Sampling Techniques, Third edition, John Wiley & Sons.

2. Singh, S. (2003). Advanced Sampling: Theory and Practice. Kluwer.

3. Des Raj and Chandhok, P. (2013) Sampling Theory, McGraw Hill.

4. Mukhopadhyay, P. (2009) Theory and methods of survey sampling, Second edition, PHI Learning Pvt Ltd., New Delhi.

5. Sampath, S. (2005) Sampling theory and methods, Alpha Science International Ltd., India.

Evaluation Pattern

CIA - 50%

ESE - 50%

MST172 - STATISTICAL COMPUTING USING R (2020 Batch)

Total Teaching Hours for Semester:60
No of Lecture Hours/Week:4
Max Marks:100
Credits:3

Course Objectives/Course Description

 

To equip students with knowledge of R programming to develop statistical models for real world problems.

Course Outcome

CO1: To demonstrate data handling using statistical tool R

CO2: To perform graphical representation of data using R

CO3: To demonstrate the usage of R for data analysis

Unit-1
Teaching Hours:12
Introduction
 

Variables, Functions, Vectors, Expressions and assignments, Logical expressions, Matrices, The workspace, R markdown.

Lab Exercises:

1. Demonstrate variables and functions in R

2. Creating vectors and matrices and associated operations in R

3. Logical and arithmetic operations in R

 

Unit-2
Teaching Hours:12
Basic Programming
 

Branching and loops: if, for, while; Program flow, Basic debugging, Good programming habits, Input and output: Input from a file, Output to a file, Plotting.

Lab Exercises:

4. Illustration of control structures: if, else, for

5. Illustration of control structures: while, repeat, break, next and ifelse

6. Getting data in and out of the R environment: reading tables, reading CSV files, readLines(), opening url, user inputs, writing files.

 

Unit-3
Teaching Hours:12
Programming with functions
 

Functions, Optional arguments and default values, Vector-based programming using functions, Recursive programming, Debugging functions, Sophisticated data structures - Factors - Dataframes - Lists - The apply family.

Lab Exercises:

7. Creating user-defined functions and doing vector-based programming

8. Creating lists and data frames and associated operations

9. Demonstration of recursive functions, apply functions in R

10. Demonstration of factors and arrays in R

 

Unit-4
Teaching Hours:12
Graphics
 

Visualizing data, Graphical summaries of data: Bar chart, Pie chart, Histogram, Box-plot, Stem and leaf plot, Frequency table, Plotting of probability distributions and sampling distributions, P-P plot, Q-Q plot, ggplot2, lattice (3D plots), Graphics parameters (par), Graphical augmentation.

Lab Exercises:

11. Visualization of numerical variables in R using ‘base R’, ‘ggplot2’ and ‘lattice 3D’ packages

12. Contingency tables and visualization of categorical variables using ‘base R’, ‘ggplot2’ and ‘lattice 3D’  packages

13. Construction  of probability plots and quantile plots in R

 

Unit-5
Teaching Hours:12
Simulation
 

Numerical methods- Root-finding algorithms, Simulating iid uniform samples, Congruential generators, Seeding, Simulating discrete random variables, Inversion method for continuous random variables, Rejection method, generation of normal variates: Rejection with exponential envelope, Box-Muller algorithm.

Lab Exercises:

14. Root finding algorithms for the non-linear system of equations.

15. Simulation of discrete variables in R

16. Simulation of continuous variables- inversion method, rejection method
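The generators named in this unit are language-independent, so they can be sketched outside R as well. The following minimal Python sketch is illustrative only and is not the course's lab code: the multiplier 16807 and modulus 2^31 - 1 are one well-known choice of congruential constants (assumed here, not taken from the syllabus), and the function names are hypothetical.

```python
import math


def lcg(seed, a=16807, c=0, m=2**31 - 1):
    """Congruential generator: x_{k+1} = (a * x_k + c) mod m."""
    x = seed
    while True:
        x = (a * x + c) % m
        yield x


def uniforms(seed, count, m=2**31 - 1):
    """Scale the congruential integers to pseudo-random numbers in (0, 1)."""
    gen = lcg(seed, m=m)
    return [next(gen) / m for _ in range(count)]


def exponential_by_inversion(u, rate):
    """Inversion method: F^{-1}(u) = -ln(1 - u) / rate for Exponential(rate)."""
    return -math.log(1.0 - u) / rate


def box_muller(u1, u2):
    """Box-Muller: map two independent uniforms to two independent N(0, 1) values."""
    r = math.sqrt(-2.0 * math.log(u1))
    theta = 2.0 * math.pi * u2
    return r * math.cos(theta), r * math.sin(theta)


u = uniforms(seed=1, count=4)
print([exponential_by_inversion(v, rate=2.0) for v in u])
print(box_muller(u[0], u[1]))
```

Fixing the seed makes the whole stream reproducible, which is the point of the "Seeding" topic above.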

 

Text Books And Reference Books:

1. Jones, O., Maillardet, R. and Robinson, A. (2014). Introduction to Scientific Programming and Simulation Using R. Chapman & Hall/CRC, The R Series.

2. Matloff, N. (2016). The art of R programming: A tour of statistical software design. No Starch Press.

Essential Reading / Recommended Reading

1. Crawley, M. J. (2012). The R Book, 2nd Edition. John Wiley & Sons.

2. Chambers, J. M. (2008). Software for Data Analysis: Programming with R. Springer-Verlag, New York.

Evaluation Pattern

CIA - 50%

ESE - 50%

MST231 - STATISTICAL INFERENCE-I (2020 Batch)

Total Teaching Hours for Semester:60
No of Lecture Hours/Week:4
Max Marks:100
Credits:4

Course Objectives/Course Description

 

To provide a strong mathematical and conceptual foundation in the methods of parametric estimation and their properties.

Course Outcome

CO1: To understand the properties of estimators.

CO2: To identify the suitable estimation method.

CO3: To analyse likelihood function and apply different root solving methods to find estimators

CO4: To construct confidence intervals for parameters involved in the model.

Unit-1
Teaching Hours:12
Sufficiency
 

Sufficiency: factorization theorem, minimal sufficiency, exponential family and completeness. Ancillary statistics and Basu's theorem.

Unit-2
Teaching Hours:12
Unbiasedness
 

UMVUE: Fisher Information and Cramér-Rao inequality, Chapman-Robbins and Bhattacharyya bounds, Rao-Blackwell theorem, Lehmann-Scheffé theorem. Unbiased estimation.

Unit-3
Teaching Hours:12
Consistent estimators
 

Consistency, Weak and strong consistency, Marginal and joint consistent estimators, CAN estimators, equivariance, Pitman estimators

Unit-4
Teaching Hours:12
Methods of point estimation
 

Method of moments,  Minimum chi-square and its modification, Least square estimation, Maximum likelihood, Properties of maximum likelihood estimators, Cramer-Huzurbazar Theorem, Likelihood equation - multiple roots, Iterative methods, EM Algorithm.
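Where the likelihood equation has no closed-form solution, the iterative methods in this unit are typically introduced through Newton-Raphson. The Python sketch below is an illustration under stated assumptions: it uses the Cauchy location model (a standard example with possibly multiple roots, chosen here and not named in the syllabus), starts from the sample median by default, and the function names are hypothetical.

```python
import statistics


def cauchy_score(theta, data):
    """First derivative of the Cauchy(theta, 1) log-likelihood."""
    return sum(2.0 * (x - theta) / (1.0 + (x - theta) ** 2) for x in data)


def cauchy_score_derivative(theta, data):
    """Second derivative of the Cauchy(theta, 1) log-likelihood."""
    return sum(
        2.0 * ((x - theta) ** 2 - 1.0) / (1.0 + (x - theta) ** 2) ** 2
        for x in data
    )


def newton_raphson_mle(data, start=None, tol=1e-10, max_iter=100):
    """Solve the likelihood equation score(theta) = 0 by Newton-Raphson."""
    theta = statistics.median(data) if start is None else start
    for _ in range(max_iter):
        step = cauchy_score(theta, data) / cauchy_score_derivative(theta, data)
        theta -= step
        if abs(step) < tol:
            return theta
    raise RuntimeError("Newton-Raphson did not converge")


print(newton_raphson_mle([1.0, 2.0, 3.0, 4.0, 5.0], start=2.5))
```

Because the Cauchy likelihood equation can have several roots, the answer depends on the starting value; a consistent starting point such as the median is the usual safeguard.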

Unit-5
Teaching Hours:12
Interval estimation
 

Large sample confidence interval, shortest length confidence interval. Methods of finding confidence interval: Inversion of test statistic, pivotal quantities, pivoting CDF, evaluation of confidence interval: size and coverage probability, loss function and test function optimality.

Text Books And Reference Books:

1.      Kale, B. K. and Muralidharan, K.  (2015). Parametric Inference: An Introduction. Alpha Science Int. Ltd.

2.      Srivastava, A. K. , Khan, A. H. and Srivastava, N. (2014). Statistical Inference: Theory of Estimation. PHI Learning Pvt. Ltd, New Delhi.

3.      Lehmann, E. L., & Casella, G. (2006). Theory of point estimation, 2nd Ed. Springer.

4.      Robert, C., & Casella, G. (2013). Monte Carlo statistical methods. Springer.

Essential Reading / Recommended Reading

1.      Casella, G., & Berger, R. L. (2002). Statistical Inference. Pacific Grove, CA: Duxbury.

2.      Silvey, S. D. (2017). Statistical inference. Routledge.

3.      Trosset, M. W. (2009). An introduction to statistical inference and its applications with R. Chapman and Hall/CRC.

4.      Dixit, U. J. (2016). Examples in parametric inference with R, Springer.

Evaluation Pattern

CIA - 50%

ESE - 50%

Total - 100%

MST232 - STOCHASTIC PROCESSES (2020 Batch)

Total Teaching Hours for Semester:60
No of Lecture Hours/Week:4
Max Marks:100
Credits:4

Course Objectives/Course Description

 

To equip the students with theoretical and practical knowledge of stochastic models which are used in economics, life sciences, engineering etc.

Course Outcome

CO1: To understand different stochastic models.

CO2: To identify ergodic Markov chains.

CO3: To analyse queuing models using continuous-time Markov chains.

CO4: To apply Brownian motion in finance problems.

Unit-1
Teaching Hours:12
Introduction
 

Sequence of random variables, definition and classification of a stochastic process, autoregressive processes and stationary processes.

Unit-2
Teaching Hours:12
Discrete time Markov chains
 

Markov Chains: Definition, Examples, Transition probability matrix, Chapman-Kolmogorov equation, classification of states, limiting and stationary distributions, ergodicity, discrete renewal equation and basic limit theorem, Absorption probabilities, Criteria for recurrence. Generic application: hidden Markov models
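The Chapman-Kolmogorov equation and the stationary distribution of an ergodic chain can be checked numerically. The Python sketch below is a minimal illustration, not course material: the two-state transition matrix is made up, the function names are hypothetical, and the stationary distribution is found by repeated multiplication, which assumes the chain is ergodic.

```python
def mat_mul(a, b):
    """Product of two matrices stored as lists of rows."""
    return [
        [sum(a[i][k] * b[k][j] for k in range(len(b))) for j in range(len(b[0]))]
        for i in range(len(a))
    ]


def n_step(p, n):
    """n-step transition matrix P^n; by Chapman-Kolmogorov, P^(m+n) = P^m P^n."""
    size = len(p)
    result = [[1.0 if i == j else 0.0 for j in range(size)] for i in range(size)]
    for _ in range(n):
        result = mat_mul(result, p)
    return result


def stationary_distribution(p, tol=1e-12, max_iter=10000):
    """Iterate pi <- pi P from the uniform distribution until it stabilises."""
    k = len(p)
    pi = [1.0 / k] * k
    for _ in range(max_iter):
        new = [sum(pi[i] * p[i][j] for i in range(k)) for j in range(k)]
        if max(abs(a - b) for a, b in zip(new, pi)) < tol:
            return new
        pi = new
    raise RuntimeError("no convergence; the chain may not be ergodic")


P = [[0.9, 0.1], [0.5, 0.5]]                      # hypothetical two-state chain
print(n_step(P, 2))
print(stationary_distribution(P))
```

For this matrix the balance equation 0.1 π0 = 0.5 π1 gives π = (5/6, 1/6), which the iteration approaches.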

Unit-3
Teaching Hours:12
Continuous time Markov chains and Poisson process
 

Transition probability function, Kolmogorov differential equations, Poisson process: homogeneous process, interarrival distribution, compound process, Birth and death process. Service applications: Queuing models, Markovian models.

Unit-4
Teaching Hours:12
Branching process
 

Galton-Watson branching processes, Generating function, Extinction probabilities, Continuous-time branching processes, Extinction probabilities, Branching processes with general variable lifetime.
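For a Galton-Watson process, the extinction probability is the smallest non-negative root of s = G(s), where G is the offspring probability generating function, and it can be reached by iterating G from 0. The Python sketch below illustrates this; the offspring distribution and the function names `pgf` and `extinction_probability` are assumptions made for the example.

```python
def pgf(s, probs):
    """Offspring generating function G(s) = sum_k p_k s^k."""
    return sum(p * s ** k for k, p in enumerate(probs))


def extinction_probability(probs, tol=1e-12, max_iter=100000):
    """Iterate q <- G(q) from q = 0; the limit is the smallest fixed point of G."""
    q = 0.0
    for _ in range(max_iter):
        q_next = pgf(q, probs)
        if abs(q_next - q) < tol:
            return q_next
        q = q_next
    return q


# Hypothetical offspring law: P(0) = 0.25, P(1) = 0.25, P(2) = 0.5 (mean 1.25 > 1)
print(extinction_probability([0.25, 0.25, 0.5]))
```

Here G(s) = 0.25 + 0.25 s + 0.5 s², whose fixed points are 0.5 and 1, so the iteration settles near 0.5; with a mean offspring number below 1 the same iteration converges to 1.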

Unit-5
Teaching Hours:12
Renewal process and Brownian motion
 

Renewal equation, Renewal theorem, Applications, Generalizations and variations of renewal processes, Applications of renewal theory, Brownian motion, Introduction to Markov renewal processes.

Text Books And Reference Books:

1.      Karlin, S. and Taylor, H.M. (2014). A first course in stochastic processes. Academic Press.

2.      Cinlar, E. (2013). Introduction to stochastic processes. Courier Corporation.

3.      S. M. Ross (2014). Introduction to Probability Models. Elsevier.

Essential Reading / Recommended Reading

1.      Feller, W. (2008) An Introduction to Probability Theory and its Applications, Volume I&II, 3rd Ed., Wiley Eastern.

2.      J. Medhi (2009) Stochastic Processes, 3rd Edition, New Age International.

3.      Dobrow, R.P. (2016), Introduction to Stochastic Processes with R, Wiley Eastern.

Evaluation Pattern

CIA - 50%

ESE - 50%

Total - 100%

MST233 - CATEGORICAL DATA ANALYSIS (2020 Batch)

Total Teaching Hours for Semester:60
No of Lecture Hours/Week:4
Max Marks:100
Credits:4

Course Objectives/Course Description

 

To equip the students with the theory and methods to analyse categorical responses.

Course Outcome

CO1: To understand the categorical response. 

CO2: To identify tests for contingency tables.

CO3: To apply regression models for count data. 

CO4: To analyse contingency tables using log-linear models. 

 

Unit-1
Teaching Hours:12
Introduction
 

Categorical response data, Probability distributions for categorical data, Statistical inference for discrete data

Unit-2
Teaching Hours:12
Contingency tables
 

Probability structure for contingency tables, Comparing proportions with 2x2 tables, The odds ratio, Tests for independence, Exact inference, Extension to three-way and larger tables
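The odds ratio and the Pearson chi-square test of independence from this unit reduce to short computations. The Python sketch below is illustrative: the 2x2 counts are invented, the function names are hypothetical, and the p-value step is left out because it needs a chi-square distribution function that the standard library does not provide.

```python
def odds_ratio(table):
    """Sample odds ratio (a*d)/(b*c) for a 2x2 table [[a, b], [c, d]]."""
    (a, b), (c, d) = table
    return (a * d) / (b * c)


def chi_square_independence(table):
    """Pearson chi-square statistic and degrees of freedom for an r x c table."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n   # under independence
            stat += (observed - expected) ** 2 / expected
    df = (len(table) - 1) * (len(table[0]) - 1)
    return stat, df


counts = [[10, 20], [30, 40]]                       # hypothetical 2x2 table
print(odds_ratio(counts))
print(chi_square_independence(counts))
```

The statistic would then be compared with the chi-square critical value for the returned degrees of freedom.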

Unit-3
Teaching Hours:12
Generalized linear models
 

Components of a generalized linear model, GLM for binary and count data, Statistical inference and model checking, Fitting GLMs

Unit-4
Teaching Hours:12
Logistic regression
 

Interpreting the logistic regression model, Inference for logistic regression, Logistic regression with categorical predictors, Multiple logistic regression, Summarizing effects, Building and applying logistic regression models, Multicategory logit models

Unit-5
Teaching Hours:12
Loglinear models for contingency tables
 

Loglinear models for two-way and three-way tables,  Inference for Loglinear models, the log linear-logistic connection, Independence graphs and collapsibility, Models for matched pairs: Comparing dependent proportions,  Logistic regression for matched pairs, Comparing margins of square contingency tables, symmetry issues 

 

Text Books And Reference Books:

1.       Agresti, A. (2012). Categorical Data Analysis, 3rd Edition. New York: Wiley

2.       Agresti, A. (2010). Analysis of ordinal categorical data. John Wiley & Sons.

Essential Reading / Recommended Reading

1.      Le, C.T. (2009). Applied Categorical Data Analysis and Translational Research, 2nd Ed., John Wiley and Sons.

2.      Stokes, M. E., Davis, C. S., & Koch, G. G. (2012). Categorical data analysis using SAS. SAS Institute.

3.      Agresti, A. (2018). An introduction to categorical data analysis. John Wiley & Sons.

4.      Bilder, C. R., & Loughin, T. M. (2014). Analysis of categorical data with R. Chapman and Hall/CRC.

Evaluation Pattern

CIA - 50%

ESE - 50%

Total - 100%

MST271 - REGRESSION ANALYSIS (2020 Batch)

Total Teaching Hours for Semester:90
No of Lecture Hours/Week:6
Max Marks:150
Credits:5

Course Objectives/Course Description

 

To impart knowledge of statistical model building using regression techniques.

Course Outcome

CO1: To understand and formulate simple and multiple regression models 

CO2: To identify the correct regression model for the given problem 

CO3: To apply non-linear regression in real-life problems. 

CO4: To analyse the robustness of the regression model. 

 

Unit-1
Teaching Hours:18
Linear regression model
 

Linear Regression Model: Simple and multiple, Least squares estimation, Properties of the estimators,  Maximum likelihood estimation, Estimation with linear restrictions, Hypothesis testing, confidence intervals.

Lab Exercises:

1.     Build a simple linear model and interpret the data.

2.     Construct confidence interval for simple linear model

3.     Build a multiple linear model and estimate its parameters.

4.     Construct confidence interval for multiple linear model
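The first two lab exercises rest on the closed-form least squares estimates for the simple linear model. The Python sketch below is a minimal illustration under stated assumptions: the function name `fit_simple_linear` is hypothetical, the data are invented, and the confidence interval itself is left to the reader because it needs a t quantile (from tables or statistical software) multiplied by the returned standard error.

```python
import math


def fit_simple_linear(x, y):
    """Least squares fit of y = b0 + b1 * x.

    Returns (b0, b1, se_b1), where se_b1 is the estimated standard error
    of the slope based on the residual variance with n - 2 degrees of freedom.
    """
    n = len(x)
    xbar = sum(x) / n
    ybar = sum(y) / n
    sxx = sum((xi - xbar) ** 2 for xi in x)
    sxy = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y))
    b1 = sxy / sxx
    b0 = ybar - b1 * xbar
    sse = sum((yi - (b0 + b1 * xi)) ** 2 for xi, yi in zip(x, y))
    s2 = sse / (n - 2)                              # residual mean square
    return b0, b1, math.sqrt(s2 / sxx)


b0, b1, se_b1 = fit_simple_linear([0, 1, 2, 3, 4], [1.1, 2.9, 5.2, 7.1, 8.8])
print(b0, b1, se_b1)
# A 95% interval for the slope is b1 +/- t(0.975, n - 2) * se_b1.
```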

Unit-2
Teaching Hours:18
Model adequacy
 

Residual analysis, Departures from underlying assumptions, Effect of outliers, Collinearity, Nonconstant variance and serial correlation, Departures from normality, Diagnostics and remedies. 

Lab Exercises:

1.     Carry out residual analysis and validate the model assumptions.

2.     Construct residual plots for checking outliers and non-constant variance.

Unit-3
Teaching Hours:18
Model Selection
 

Selection of input variables and model selection. Methods of obtaining the best fit: stepwise regression, forward selection and backward elimination.

Lab Exercises:

1.     Selecting best model using step wise regression.

2.     Selecting best model using Forward and backward selection procedure.

Unit-4
Teaching Hours:18
Nonlinear regression
 

Introduction to general non-linear regression, least squares in the non-linear case, estimating the parameters of a non-linear system, reparametrisation of the model, non-linear growth models.

Lab Exercise:

1.     Estimate parameters in non-linear models using the least squares procedure

Unit-5
Teaching Hours:18
Robust regression
 

Least absolute deviation regression, M-estimators, robust regression with rank residuals, resampling procedures for regression models: methods and their properties (without proof), Jackknife techniques and least squares approach based on M-estimators.

Lab Exercises:

1.     Illustrate resampling procedures in regression models.

2.     Build a regression model using robust regression procedures.
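As one example of a resampling procedure for regression, the delete-one jackknife recomputes the slope with each observation left out in turn. The Python sketch below is only an illustration: the function names and data are hypothetical, inputs are assumed to be Python lists, and the usual jackknife variance formula (n - 1)/n times the sum of squared deviations of the leave-one-out estimates is used.

```python
import math


def slope(x, y):
    """Least squares slope of y on x."""
    n = len(x)
    xbar = sum(x) / n
    ybar = sum(y) / n
    sxy = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y))
    sxx = sum((xi - xbar) ** 2 for xi in x)
    return sxy / sxx


def jackknife_slope(x, y):
    """Return the mean of the leave-one-out slopes and the jackknife standard error."""
    n = len(x)
    loo = [slope(x[:i] + x[i + 1:], y[:i] + y[i + 1:]) for i in range(n)]
    mean_loo = sum(loo) / n
    variance = (n - 1) / n * sum((b - mean_loo) ** 2 for b in loo)
    return mean_loo, math.sqrt(variance)


x = [1, 2, 3, 4, 6, 8]
y = [2.1, 3.9, 6.2, 8.1, 11.7, 16.4]                 # hypothetical data
print(slope(x, y), jackknife_slope(x, y))
```

An observation whose removal changes the slope sharply shows up as an outlying leave-one-out estimate, which links this exercise to the outlier diagnostics of Unit 2.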

Text Books And Reference Books:

1.      Chatterjee, S., & Hadi, A. S. (2015). Regression analysis by example, 5th Ed., John Wiley & Sons.

2.      Draper, N. R., & Smith, H. (1998). Applied regression analysis. John Wiley & Sons.

3.      Montgomery, D. C., Peck, E. A., & Vining, G. G. (2012). Introduction to linear regression analysis, John Wiley & Sons.

 

Essential Reading / Recommended Reading

1.      Seber, G. A., & Lee, A. J. (2012). Linear regression analysis, John Wiley & Sons.

2.      Keith, T. Z. (2014). Multiple regression and beyond: An introduction to multiple regression and structural equation modelling. Routledge.

3.      Fox, J. (2015). Applied regression analysis and generalized linear models. Sage Publications.

4.      Fox, J., & Weisberg, S. (2018). An R companion to applied regression. Sage publications.

 

Evaluation Pattern

CIA - 50%

ESE - 50%

Total - 100%

MST272 - STATISTICAL COMPUTING USING PYTHON (2020 Batch)

Total Teaching Hours for Semester:60
No of Lecture Hours/Week:4
Max Marks:100
Credits:3

Course Objectives/Course Description

 

To equip students with programming skills in Python and to apply them in data analysis.

Course Outcome

CO1: To understand Python and its basic syntax

CO2: To understand functions and data modelling

CO3: To analyze statistical datasets and visualize them.

 

Unit-1
Teaching Hours:20
Introduction
 

Installing Python; basic syntax, interactive shell, editing, saving, and running a script. The concept of data types; variables, assignments; immutable variables; numerical types; arithmetic operators and expressions; comments in the program; understanding error messages. Conditions, boolean logic, logical operators; ranges. Control statements: if-else, loops.

Unit-2
Teaching Hours:20
Design with functions
 

Hiding redundancy and complexity; arguments and return values; formal vs actual arguments, named arguments. Program structure and design. Recursive functions. Classes and OOP: classes, objects, attributes and methods; defining classes; design with classes, data modelling.
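The topics of this unit can be illustrated with a short sketch: a function with a default (named) argument, a recursive function, and a small class used as a data model. All names (power, factorial, Sample) are hypothetical and chosen only for illustration.

```python
def power(base, exponent=2):
    """Formal parameters are base and exponent; exponent has a default."""
    return base ** exponent


def factorial(n):
    """Recursive function: n! = n * (n - 1)!, with 0! = 1! = 1."""
    return 1 if n <= 1 else n * factorial(n - 1)


class Sample:
    """A simple data model for a numeric sample."""

    def __init__(self, values):
        self.values = list(values)  # attribute holding the data

    def mean(self):
        return sum(self.values) / len(self.values)

    def variance(self):
        """Unbiased sample variance (divisor n - 1)."""
        m = self.mean()
        return sum((v - m) ** 2 for v in self.values) / (len(self.values) - 1)


# Actual arguments can be passed by position or by name.
print(power(3))                    # uses the default exponent
print(power(exponent=3, base=2))   # named arguments, any order
print(factorial(5))

s = Sample([2, 4, 6])
print(s.mean(), s.variance())
```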

Unit-3
Teaching Hours:20
Statistical tools
 

Pandas, Statsmodels, Seaborn, displaying statistical data, distributions and hypothesis testing, linear regression models. 

Text Books And Reference Books:

1.      Lambert, K. A. (2018). Fundamentals of Python: first programs. Cengage Learning.

2.      Haslwanter, T. (2016). An Introduction to Statistics with Python. Springer International Publishing.

Essential Reading / Recommended Reading

1.      Unpingco, J. (2016). Python for probability, statistics, and machine learning (Vol. 1), Springer International Publishing.

2.      Anthony, F. (2015). Mastering pandas. Packt Publishing Ltd.

 

Evaluation Pattern

CIA - 50%

ESE - 50%

Total - 100%

MST273A - PRINCIPLES OF DATA SCIENCE AND DATA BASE TECHNIQUES (2020 Batch)

Total Teaching Hours for Semester:75
No of Lecture Hours/Week:5
Max Marks:150
Credits:4

Course Objectives/Course Description

 

To provide a strong foundation in data science and its application areas, and to understand the underlying core concepts and emerging technologies in data science.

Course Outcome

CO1: Explore the fundamental concepts of data science 

CO2: Understand data analysis techniques for applications handling large data 

CO3: Demonstrate various databases and compose effective queries

 

Unit-1
Teaching Hours:15
Introduction to Data Science
 

 

Introduction – Big Data and Data Science – Data Science Hype – Getting Past the Hype – The Current Landscape – Role of Data Scientist – Exploratory Data Analysis – Data Science Process Overview – Defining goals – Retrieving data – Data preparation – Data exploration – Data modeling – Presentation. Problems in handling large data – General techniques for handling large data – Big Data and its importance, Four Vs, Drivers for Big Data, Big Data analytics, Big Data applications, Algorithms using MapReduce, Matrix-Vector Multiplication by MapReduce. Steps in Big Data – Distributing data storage and processing with frameworks – Data science ethics – Valuing different aspects of privacy – The five C’s of data.

 

1.     Lab exercise for feature engineering

 

2.     Lab exercise for big data processing
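The unit topic "Matrix-Vector Multiplication by MapReduce" can be sketched on a single machine as follows. This is an illustration of the map and reduce steps only, not a distributed implementation; the function names and the sparse (row, column, value) input format are assumptions.

```python
from collections import defaultdict


def map_phase(matrix_entries, vector):
    """Emit (row, m_ij * v_j) for every stored matrix entry (i, j, m_ij)."""
    for i, j, m_ij in matrix_entries:
        yield i, m_ij * vector[j]


def reduce_phase(pairs):
    """Sum the emitted products that share the same row key."""
    totals = defaultdict(float)
    for key, value in pairs:
        totals[key] += value
    return dict(totals)


# Hypothetical 2 x 2 matrix [[1, 2], [3, 4]] stored as (row, col, value).
matrix_entries = [(0, 0, 1.0), (0, 1, 2.0), (1, 0, 3.0), (1, 1, 4.0)]
vector = [5.0, 6.0]

result = reduce_phase(map_phase(matrix_entries, vector))
print(result)  # one entry per row of the product vector
```

In a real cluster the mapper output would be shuffled by key across machines before the reducers run; here a dictionary stands in for that step.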

 

 

Unit-2
Teaching Hours:15
Machine Learning
 

Machine learning – Modeling Process – Training model – Validating model – predicting new observations – Supervised learning algorithms – Unsupervised learning algorithms. Introduction to deep learning – Deep Feed Forward networks – Regularization – Optimization of deep learning – Convolutional networks – Recurrent and recursive nets – applications of deep learning.

1.     Lab exercise on Linear and Logistic discrimination

2.     Lab exercise on K-means clustering and Hierarchical clustering

Unit-3
Teaching Hours:15
Introduction to Relational Database and Design
 

Concept and Overview of DBMS, Data Models, Database Languages, Database Administrator, Database Users, Three Schema architecture of DBMS. Basic concepts, Design Issues, Mapping Constraints, Keys, Entity-Relationship Diagram, Weak Entity Sets, Functional Dependency, Different anomalies in designing a Database, Normalization: using functional dependencies, 1NF, 2NF, 3NF and Boyce-Codd Normal Form

1. Lab Exercise on Database Design: Top-Down Approach and Bottom-Up Approach

Unit-4
Teaching Hours:15
Database Querying and Data Integration
 

SQL Basic Structure - DDL, DML, DCL - Integrity Constraints - Domain Constraints, Entity Constraints - Referential Integrity Constraints, Concept of Set operations, Joins, Aggregate Functions, Null Values, assertions, views, Nested Sub queries – procedural extensions – stored procedures – functions – cursors – Intelligent databases – ECA rule – Data Integration – ETL Process

 

1.         Lab Exercise on SQL

2.         Lab Exercise on PL/SQL

3.         Lab Exercise on ETL
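The SQL lab can be tried without a database server by using Python's built-in sqlite3 module. The sketch below touches DDL with key and referential constraints, DML inserts, a join, aggregate functions and a null value. The choice of SQLite and the table and column names are assumptions; the syllabus does not name a database system.

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # throwaway in-memory database
cur = conn.cursor()
cur.execute("PRAGMA foreign_keys = ON")  # enforce referential integrity

# DDL: primary keys (entity constraint) and a foreign key (referential).
cur.execute(
    "CREATE TABLE department (dept_id INTEGER PRIMARY KEY, name TEXT NOT NULL)"
)
cur.execute(
    """CREATE TABLE student (
           student_id INTEGER PRIMARY KEY,
           name       TEXT NOT NULL,
           marks      INTEGER,
           dept_id    INTEGER REFERENCES department(dept_id)
       )"""
)

# DML: insert hypothetical rows; one student has NULL marks.
cur.executemany(
    "INSERT INTO department VALUES (?, ?)",
    [(1, "Statistics"), (2, "Mathematics")],
)
cur.executemany(
    "INSERT INTO student VALUES (?, ?, ?, ?)",
    [(1, "A", 80, 1), (2, "B", None, 1), (3, "C", 70, 2)],
)

# Join plus aggregates; AVG skips NULL marks, COUNT(*) counts every row.
rows = cur.execute(
    """SELECT d.name, COUNT(*), AVG(s.marks)
       FROM student AS s
       JOIN department AS d ON s.dept_id = d.dept_id
       GROUP BY d.name
       ORDER BY d.name"""
).fetchall()
conn.close()

for row in rows:
    print(row)
```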

Unit-5
Teaching Hours:15
Introduction to Data Warehouse
 

Data Warehousing - Defining Feature – Data warehouses and data marts – Metadata in the data warehouse – Data design and Data preparation - Dimensional Modeling - Principles of dimensional modeling – The star schema – star schema keys – Advantages of the star schema – Updates to the dimension tables – The snowflake schema – Aggregate fact tables – Families of Stars – MDX queries – Reporting services.

1. Lab Exercise on Analysis Services

2. Lab Exercise on Reporting Services

Text Books And Reference Books:

1. Davy Cielen, Arno D. B. Meysman, Mohamed Ali (2016), Introducing Data Science, Manning Publications Co.

2. Thomas Connolly and Carolyn Begg, (2007), Database Systems: A Practical Approach to Design, Implementation and Management, 3rd Edition, Pearson Education.

 

Essential Reading / Recommended Reading

1. Gareth James, Daniela Witten, Trevor Hastie, Robert Tibshirani (2013), An Introduction to Statistical Learning: with Applications in R, Springer.

2. D J Patil, Hilary Mason, Mike Loukides, (2018), Ethics and Data Science, O’Reilly.

3. Lior Rokach and Oded Maimon, (2010), Data Mining and Knowledge Discovery Handbook.

 

Evaluation Pattern

CIA - 50%

ESE - 50%

Total - 100%

MST273B - SURVIVAL ANALYSIS (2020 Batch)

Total Teaching Hours for Semester:75
No of Lecture Hours/Week:5
Max Marks:150
Credits:4

Course Objectives/Course Description

 

This course will provide an introduction to the principles and methods for the analysis of time-to-event data. This type of data occurs extensively in both observational and experimental biomedical and public health studies. 

Course Outcome

CO1: Explore the fundamental concepts of Survival Models 

CO2: Understand Non-Parametric Survival techniques for applications to lifetime data

CO3: Demonstrate various Competing Risks and their effects 

 

Unit-1
Teaching Hours:15
Basic quantities and censoring
 

The hazard and survival functions. Mean residual life function, competing risk, right, left and interval censoring, truncation, likelihood for censored and truncated data. Parametric and non-parametric estimation in truncated and censored cases.

Lab Exercises:

1.Lab exercise on the parametric estimation of left and right-censored data

2.Lab exercise on the parametric estimation of truncated data

3.Lab exercise on the non-parametric estimation of censored and truncated data

Unit-2
Teaching Hours:15
Parametric Survival Models
 

Parametric forms and the distribution of log time. The exponential, Weibull, Gompertz, Gamma, Generalized Gamma, Coale-McNeil, and generalized F distributions. The U.S. life table. Approaches to modelling the effects of covariates. Parametric families. Proportional hazards models (PH). Accelerated failure time models (AFT). The intersection of PH and AFT. Proportional odds models (PO). The intersection of PO and AFT. Recidivism in the U.S. 

Lab Exercises:

4.Lab exercise on parametric modelling of survival data

5.Lab exercise on the proportional hazard model

6.Lab exercise on AFT models

 

Unit-3
Teaching Hours:15
Non-Parametric Survival Models
 

One-sample estimation with censored data. The Kaplan-Meier estimator. Greenwood's formula. The Nelson-Aalen estimator. The expectation of life. Comparison of several groups: Mantel-Haenszel and the log-rank test.

Regression: Cox's model and partial likelihood. The score and information. The problem of ties. Tests of hypotheses. Time-varying covariates. Estimating the baseline survival. Martingale residuals.

Lab Exercises:

7.Lab exercise on Kaplan-Meier estimator and Nelson-Aalen estimator

8.Lab exercise on Mantel-Haenszel and the log-rank test

9.Lab exercise on the Cox model with time-varying covariate
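As a minimal sketch for the Kaplan-Meier and Nelson-Aalen lab exercise above, both estimators can be computed directly from right-censored data. Python is an assumed language choice; the data and the function name km_and_na are hypothetical. Observations censored at a failure time are treated as still at risk at that time, which is the usual convention.

```python
def km_and_na(times, events):
    """Kaplan-Meier survival and Nelson-Aalen cumulative hazard.

    events[i] is 1 for an observed failure and 0 for right censoring.
    Returns a list of (t, S_KM(t), H_NA(t)) at each distinct failure time.
    """
    data = sorted(zip(times, events))
    n_at_risk = len(data)
    surv, cum_hazard = 1.0, 0.0
    out = []
    i = 0
    while i < len(data):
        t = data[i][0]
        failures = 0   # failures observed at time t
        leaving = 0    # all subjects leaving the risk set at time t
        while i < len(data) and data[i][0] == t:
            failures += data[i][1]
            leaving += 1
            i += 1
        if failures > 0:
            surv *= 1 - failures / n_at_risk
            cum_hazard += failures / n_at_risk
            out.append((t, surv, cum_hazard))
        n_at_risk -= leaving
    return out


# Hypothetical lifetimes; 0 marks a right-censored observation.
times = [2, 3, 3, 5, 8, 9]
events = [1, 1, 0, 1, 0, 1]

for t, s, h in km_and_na(times, events):
    print(f"t={t}: S(t)={s:.4f}, H(t)={h:.4f}")
```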

Unit-4
Teaching Hours:15
Models for Discrete Data and Extensions
 

Cox's discrete logistic model and logistic regression. Modelling grouped continuous data and the complementary log-log transformation. Piece-wise constant hazards and Poisson regression. Current status data versus retrospective data. Open intervals and time since the last event. Backward recurrence times. Interval censoring. 

Lab Exercises:

10.Lab exercise on the discrete logistic model for survival data

11.Lab exercise on Poisson regression for survival data

12.Lab exercise on Piece-wise regression for survival data

Unit-5
Teaching Hours:15
Models for Competing Risks
 

Modelling multiple causes of failure. Research questions of interest. Cause-specific hazards. Overall survival. Cause-specific densities. Estimation: one-sample and the generalized Kaplan-Meier and Nelson-Aalen estimators. The Incidence function. Regression models. Weibull regression. Cox regression and partial likelihood. Piece-wise exponential survival and multinomial logits. The identification problem. Multivariate and marginal survival. The Fine-Gray model.

Lab Exercises:

13.Lab exercise on non-parametric modelling of competing risk data 

14.Lab exercise on parametric modelling of competing risk data

15.Lab exercise on multivariate survival data

Text Books And Reference Books:

1. Klein, J. P., & Moeschberger, M. L. (2006). Survival analysis: techniques for censored and truncated data. Springer Science & Business Media. 

2. Cleves, M.; W. G. Gould, and J. Marchenko (2016). An Introduction to Survival Analysis using Stata. Revised 3rd Ed. College Station, Texas: Stata Press. 

3. Kalbfleisch, J. D., & Prentice, R. L. (2011). The statistical analysis of failure time data, 2nd Ed. John Wiley & Sons.

4. Moore, D. F. (2016). Applied survival analysis using R. Switzerland: Springer. 

 

Essential Reading / Recommended Reading

1. Singer, J.D and J. B. Willett (2003) Applied Longitudinal Data Analysis: Modeling Change and Event Occurrence. Oxford, Oxford University Press. 

2. Therneau, T. M. and P. M. Grambsch (2000). Modelling Survival Data: Extending the Cox Model, Springer, NY

3. Collett, D. (2015). Modelling survival data in medical research. Chapman and Hall/CRC. 

 

Evaluation Pattern

CIA - 50%

ESE - 50%

Total - 100%

MST273C - STATISTICAL QUALITY CONTROL (2020 Batch)

Total Teaching Hours for Semester:75
No of Lecture Hours/Week:5
Max Marks:150
Credits:4

Course Objectives/Course Description

 

This course provides an introduction to the application of statistical tools in the industrial environment to study, analyze and control the quality of products.

Course Outcome

CO1: Demonstrate the concepts of control charts to improve the quality standards of the process.

CO2: Apply the idea of Sampling Plans to control the quality of industrial outputs. 

Unit-1
Teaching Hours:15
Statistical Process Control
 

Meaning and scope of statistical quality control - Causes of quality variation - Control charts for variables and attributes - Rational subgroups - Construction and operation of X̄, σ, R, np, p, c and u charts - Operating characteristic curves of control charts. Process capability analysis using histogram, probability plotting and control chart - Process capability ratios and their interpretations.

1.      Lab exercise on control charts for variables

2.      Lab exercise on control charts for attributes

3. Lab exercise on operating characteristic curve
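For the lab exercise on control charts for variables, the three-sigma limits of the X-bar and R charts can be sketched as below. The sketch assumes subgroups of size 5, for which the standard control chart constants are A2 = 0.577, D3 = 0 and D4 = 2.114; the measurements and the function name are hypothetical, and Python is an assumed language choice.

```python
# Standard control chart constants for subgroup size n = 5.
A2, D3, D4 = 0.577, 0.0, 2.114


def xbar_r_limits(subgroups):
    """Return (LCL, centre line, UCL) for the X-bar chart and the R chart."""
    means = [sum(g) / len(g) for g in subgroups]
    ranges = [max(g) - min(g) for g in subgroups]
    grand_mean = sum(means) / len(means)
    r_bar = sum(ranges) / len(ranges)
    return {
        "xbar": (grand_mean - A2 * r_bar, grand_mean, grand_mean + A2 * r_bar),
        "r": (D3 * r_bar, r_bar, D4 * r_bar),
    }


# Hypothetical measurements: four subgroups of five items each.
subgroups = [
    [10.1, 9.8, 10.0, 10.3, 9.9],
    [10.2, 10.0, 9.7, 10.1, 10.0],
    [9.9, 10.4, 10.1, 9.8, 10.2],
    [10.0, 10.1, 9.9, 10.2, 9.6],
]

limits = xbar_r_limits(subgroups)
print("X-bar chart (LCL, CL, UCL):", limits["xbar"])
print("R chart     (LCL, CL, UCL):", limits["r"])
```

A subgroup mean or range falling outside its limits would be read as a signal of an assignable cause.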

Unit-2
Teaching Hours:15
Advanced Control Charts
 

Specification limits and tolerance limits - Modified control charts - Basic principles and design of cumulative-sum control charts – Concept of V-mask procedure – Tabular CUSUM charts. Construction of Moving range, moving-average and geometric moving-average control charts.

4.      Lab exercise on CUSUM charts

5.      Lab exercise on moving average charts

6. Lab exercise on geometric moving average charts
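The tabular CUSUM and the geometric (exponentially weighted) moving average named in this unit both reduce to short recursions. The sketch below assumes Python; the parameter names (target, k for the reference value, h for the decision interval, lam for the smoothing weight) and the data are hypothetical.

```python
def tabular_cusum(xs, target, k, h):
    """One-sided upper and lower CUSUMs; returns (C_plus, C_minus, signal)."""
    c_plus = c_minus = 0.0
    out = []
    for x in xs:
        # Accumulate deviations beyond the reference value k on each side.
        c_plus = max(0.0, x - (target + k) + c_plus)
        c_minus = max(0.0, (target - k) - x + c_minus)
        out.append((c_plus, c_minus, c_plus > h or c_minus > h))
    return out


def ewma(xs, target, lam):
    """Geometric moving average: z_t = lam * x_t + (1 - lam) * z_(t-1)."""
    z = target  # the recursion starts at the process target
    out = []
    for x in xs:
        z = lam * x + (1 - lam) * z
        out.append(z)
    return out


# Hypothetical observations: the mean shifts upward from the third point.
observations = [10.0, 10.0, 12.0, 12.0, 12.0]

for row in tabular_cusum(observations, target=10.0, k=0.5, h=4.0):
    print(row)
print(ewma(observations, target=10.0, lam=0.2))
```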

Unit-3
Teaching Hours:15
Statistical Product Control: Attributes
 

Acceptance sampling: Sampling inspection by attributes – single, double and multiple sampling plans – Rectifying Inspection. Measures of performance: OC, ASN, ATI and AOQ functions. Concepts of AQL, LTPD and IQL. Dodge-Romig and MIL-STD-105D tables.

7.      Lab exercise on single sampling scheme

8.      Lab exercise on double sampling scheme

9.      Lab exercise on Dodge-Romig sampling scheme
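For the single sampling lab exercise, the OC, AOQ and ATI functions of a plan with sample size n and acceptance number c can be sketched under the binomial model. The rectifying-inspection formulas assume that rejected lots are fully screened and defectives are replaced; the plan values and function names below are hypothetical, and Python is an assumed language choice.

```python
from math import comb


def prob_accept(p, n, c):
    """OC function: P(at most c defectives in a sample of n), binomial model."""
    return sum(comb(n, d) * p ** d * (1 - p) ** (n - d) for d in range(c + 1))


def aoq(p, n, c, lot_size):
    """Average outgoing quality under rectifying inspection."""
    return prob_accept(p, n, c) * p * (lot_size - n) / lot_size


def ati(p, n, c, lot_size):
    """Average total inspection per lot under rectifying inspection."""
    return n + (1 - prob_accept(p, n, c)) * (lot_size - n)


# Hypothetical plan: n = 89, c = 2, lots of 10000 items.
n, c, lot_size = 89, 2, 10000
for p in (0.005, 0.01, 0.02, 0.05):
    print(f"p={p:.3f}  Pa={prob_accept(p, n, c):.4f}  "
          f"AOQ={aoq(p, n, c, lot_size):.5f}  ATI={ati(p, n, c, lot_size):.1f}")
```

Plotting prob_accept against p gives the OC curve; the maximum of aoq over p is the AOQL.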

Unit-4
Teaching Hours:15
Statistical Product Control: Variables
 

Sampling inspection by variables - known and unknown sigma variables sampling plan - Merits and limitations of variables sampling plan - Derivation of OC curve – determination of plan parameters. 

10.      Lab exercise on variable sampling scheme with known variance

11.      Lab exercise on variable sampling scheme with unknown variance

12. Lab exercise on OC curves

Unit-5
Teaching Hours:15
Continuous Sampling Plans
 

Continuous sampling plans by attributes - CSP-1 and its modifications - concept of AOQL in CSPs - Multi-level continuous sampling plans - Operation of multi-level CSP of Lieberman and Solomon – Wald-Wolfowitz continuous sampling plans. Sequential Sampling Plans by attributes – Decision Lines - OC and ASN functions.

13.      Lab exercise on CSP-1

14.      Lab exercise on multi-level CSP

15. Lab exercise on sequential sampling plan
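The decision lines of an item-by-item sequential sampling plan by attributes can be sketched as follows, using the usual acceptance line X_A = -h1 + s*n and rejection line X_R = h2 + s*n. The function names and the plan values (p1, p2, alpha, beta) are hypothetical, and Python is an assumed language choice.

```python
import math


def sequential_plan(p1, p2, alpha, beta):
    """Return (h1, h2, s) for the acceptance and rejection lines."""
    k = math.log(p2 * (1 - p1) / (p1 * (1 - p2)))
    h1 = math.log((1 - alpha) / beta) / k
    h2 = math.log((1 - beta) / alpha) / k
    s = math.log((1 - p1) / (1 - p2)) / k
    return h1, h2, s


def decision(n, defects, h1, h2, s):
    """Compare the cumulative defect count after n items with both lines."""
    if defects <= -h1 + s * n:
        return "accept"
    if defects >= h2 + s * n:
        return "reject"
    return "continue"


# Hypothetical plan: p1 = 0.01, alpha = 0.05, p2 = 0.06, beta = 0.10.
h1, h2, s = sequential_plan(0.01, 0.06, 0.05, 0.10)
print(f"X_A = -{h1:.3f} + {s:.4f} n,  X_R = {h2:.3f} + {s:.4f} n")
print(decision(10, 0, h1, h2, s))
print(decision(10, 2, h1, h2, s))
print(decision(50, 0, h1, h2, s))
```

The ratios h1, h2 and s do not depend on the base of the logarithm, so natural logarithms are used throughout.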

Text Books And Reference Books:

1. Montgomery, D. C. (2009). Introduction to Statistical Quality Control, Sixth Edition, Wiley India, New Delhi. 

2. Duncan, A. J. (2003). Quality Control and Industrial Statistics, Irwin-Illinois, US.

 

Essential Reading / Recommended Reading

1. Juran, J.M., and De Feo, J.A. (2010). Juran’s Quality Handbook: The Complete Guide to Performance Excellence, Sixth Edition, Tata McGraw-Hill, New Delhi.

2. Schilling, E. G., and Neubauer, D.V. (2009). Acceptance Sampling in Quality Control, Second Edition, CRC Press, New York.

3. Ross, S. M. (2009). Introduction to Probability Models, Tenth Edition, Academic Press, MA, US. 

 

 

Evaluation Pattern

CIA - 50%

ESE - 50%

Total - 100%

MST281 - RESEARCH MODELING AND IMPLEMENTATION (2020 Batch)

Total Teaching Hours for Semester:30
No of Lecture Hours/Week:2
Max Marks:50
Credits:1

Course Objectives/Course Description

 

The course inculcates a research culture that enhances the employability skills of the students.

Course Outcome

CO1: Demonstrate the objective and data collection methodology for a research problem.

Unit-1
Teaching Hours:30
Problem Identification
 

Students will do the following:

1. Identify a domain for the research project

2. Literature survey 

3. Identifying the existing methodology and models

4. Writing a problem statement 

5. Project presentation at the end of the process

Text Books And Reference Books:

-

Essential Reading / Recommended Reading

-

Evaluation Pattern

CIA - 50%

ESE - 50%