<?xml version="1.0" encoding="utf-8" standalone="yes"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom">
<channel>
<title>A Hugo website</title>
<link>/</link>
<description>Recent content on A Hugo website</description>
<generator>Hugo -- gohugo.io</generator>
<language>en-us</language>
<lastBuildDate>Thu, 28 Dec 2023 00:00:00 +0000</lastBuildDate><atom:link href="/index.xml" rel="self" type="application/rss+xml" />
<item>
<title>An Introduction to Generalized Linear Models (4th edition)</title>
<link>/2023/12/28/an-introduction-to-generalized-linear-models/</link>
<pubDate>Thu, 28 Dec 2023 00:00:00 +0000</pubDate>
<guid>/2023/12/28/an-introduction-to-generalized-linear-models/</guid>
<description>Chapter2 Model Fitting2.5 ExercisesChapter3 Exponential Family and Generalized Linear ModelsExercisesChapter4 EstimationChapter5 InferenceChapter6 Normal Linear ModelsChapter7 Binary Variables and Logistic RegressionChapter8 Nominal and Ordinal Logistic Regression8.2 Multinomial distributionExercisesChapter9 Poisson Regression and Log-Linear Models9.2 Poisson regressionChapter10 Survival AnalysisChapter11 Clustered and Longitudinal DataChapter12 Bayesian AnalysisChapter13 Markov Chain Monte Carlo MethodsChapter14 Example Bayesian AnalysesChapter2 Model Fittinglibrary(dobson)library(ggprism)library(tidyverse)birthweight## # A tibble: 12 × 4## `boys gestational age` `boys weight` `girls gestational age` `girls weight`## &lt;dbl&gt; &lt;dbl&gt; &lt;dbl&gt; &lt;dbl&gt;## 1 40 2968 40 3317## 2 38 2795 36 2729## 3 40 3163 40 2935## 4 35 2925 38 2754## 5 36 2625 42 3210## 6 37 2847 39 2817## 7 41 3292 40 3126## 8 40 3473 37 2539## 9 37 2628 36 2412## 10 38 3176 38 2991## 11 40 3421 39 2875## 12 38 2975 40 3231dim(birthweight)## [1] 12 4library(tidyverse)birthweight |&gt; ggplot(aes(x=`boys gestational age`,y=`boys weight`)) + geom_point(shape=1, size=3) + geom_point(aes(x=`girls gestational age`, y=`girls weight`), shape=19, size=3) +theme_bw() + theme(# Hide panel borders and remove grid lines#panel.</description>
</item>
<item>
<title>Common statistical tests are linear models</title>
<link>/2023/09/24/common-statistical-tests-are-linear-models-or-how-to-teach-stats/</link>
<pubDate>Sun, 24 Sep 2023 00:00:00 +0000</pubDate>
<guid>/2023/09/24/common-statistical-tests-are-linear-models-or-how-to-teach-stats/</guid>
<description>Interpretation of R’s lm() outputFive point summaryCoefficients and \(\hat{\beta_i}s\)\(t\)-statisticsResidual standard errorAdjusted \(R^2\)\(F\)-statisticThe simplicity underlying common testsSettings and toy dataPearson and Spearman correlationTheory: As linear modelsTheory: rank-transformationR code: Pearson correlationR code: Spearman correlationOne meanOne sample t-test and Wilcoxon signed-rankTheory: As linear modelsR code: One-sample t-testR code: Wilcoxon signed-rank testPaired samples t-test and Wilcoxon matched pairsTheory: As linear modelsR code: Paired sample t-testR code: Wilcoxon matched pairsTwo meansIndependent t-test and Mann-Whitney UTheory: As linear modelsTheory: Dummy codingTheory: Dummy coding (continued)R code: independent t-testR code: Mann-Whitney UWelch’s t-testThree or more meansOne-way ANOVA and Kruskal-WallisTheory: As linear modelsExample dataR code: one-way ANOVAR code: Kruskal-WallisTwo-way ANOVATheory: As linear modelsR code: Two-way ANOVAANCOVAProportions: Chi-square is a log-linear modelGoodness of fitTheory: As log-linear modelExample dataR code: Goodness of fitContingency tablesTheory: As log-linear modelExample dataR code: Chi-square testSources and further equivalencesExplicit GLM(M) Equivalents for Standard TestsExplicit GLM Test: PoissonBinomial Test: Logistic RegressionClassical Test: Exact Binomial TestExplicit GLM: Logit (or Probit)Probability density function of Logistic distributionProportion Test: (Multinomial) Logistic or Poisson ModelClassical Test: Test for Equality of ProportionsExplicit GLM: LogitExplicit GLM: PoissonClassical Test: Poisson TestReferences#share-buttons img {width: 40px;padding-right: 15px;border: 0;box-shadow: 0;display: inline;vertical-align: top;}# Options for building this documentknitr::opts_chunk$set(fig.</description>
</item>
<item>
<title>Introducing Monte Carlo Methods with R</title>
<link>/2023/08/28/introducing-monte-carlo-methods-with-r/</link>
<pubDate>Mon, 28 Aug 2023 00:00:00 +0000</pubDate>
<guid>/2023/08/28/introducing-monte-carlo-methods-with-r/</guid>
<description>1. Basic R ProgrammingBivariate Normal distributionsthe t value of Pearson correlationKolmogorov-Smirnov Goodness-of-Fit TestShapiro–Wilk normality testWilcoxon signed rank testNewton’s method for calculating the square root1. Basic R Programminglibrary(MASS)e &lt;- c(1:5)d &lt;- c(6:10)e*d## [1] 6 14 24 36 50t(e)*d## [,1] [,2] [,3] [,4] [,5]## [1,] 6 14 24 36 50sum(t(e)*d)## [1] 130t(e)%*%d## [,1]## [1,] 130d%*%t(e)## [,1] [,2] [,3] [,4] [,5]## [1,] 6 12 18 24 30## [2,] 7 14 21 28 35## [3,] 8 16 24 32 40## [4,] 9 18 27 36 45## [5,] 10 20 30 40 50x1=matrix(1:20,nrow=5) #build the numeric matrix x1 of dimension#5 4 with rst row 1, 6, 11, 16x2=matrix(1:20,nrow=5,byrow=T) #build the numeric matrix x2 of dimension#5 4 with rst row 1, 2, 3, 4x3=t(x2) #transpose the matrix x2b=x3%*%x2b## [,1] [,2] [,3] [,4]## [1,] 565 610 655 700## [2,] 610 660 710 760## [3,] 655 710 765 820## [4,] 700 760 820 880sum(b)## [1] 11380c=x2%*%x3c## [,1] [,2] [,3] [,4] [,5]## [1,] 30 70 110 150 190## [2,] 70 174 278 382 486## [3,] 110 278 446 614 782## [4,] 150 382 614 846 1078## [5,] 190 486 782 1078 1374sum(c)## [1] 11150m &lt;- runif(16)m2 &lt;- matrix(m, nrow=4)m2## [,1] [,2] [,3] [,4]## [1,] 0.</description>
</item>
<item>
<title>Modern Statistics for Modern Biology-2</title>
<link>/2023/04/29/modern-statistics-for-modern-biology-2/</link>
<pubDate>Sat, 29 Apr 2023 00:00:00 +0000</pubDate>
<guid>/2023/04/29/modern-statistics-for-modern-biology-2/</guid>
<description>5 Clustering6 Testing7 Multivariate Analysis8 High-Throughput Count Data9 Multivariate methods for heterogeneous data5 Clustering## -----------------------------------------------------------------------------library(&quot;MASS&quot;)library(&quot;RColorBrewer&quot;)set.seed(101)n &lt;- 60000S1=matrix(c(1,.72,.72,1), ncol=2)S2=matrix(c(1.5,-0.6,-0.6,1.5),ncol=2)mu1=c(.5,2.5)mu2=c(6.5,4)X1 = mvrnorm(n, mu=c(.5,2.5), Sigma=matrix(c(1,.72,.72,1), ncol=2))X2 = mvrnorm(n,mu=c(6.5,4), Sigma=matrix(c(1.5,-0.6,-0.6,1.5),ncol=2))# A color palette from blue to yellow to redk = 11my.cols &lt;- rev(brewer.pal(k, &quot;RdYlBu&quot;))plot(X1, xlim=c(-4,12),ylim=c(-2,9), xlab=&quot;Orange&quot;, ylab=&quot;Red&quot;, pch=&#39;.</description>
</item>
<item>
<title>Modern Statistics for Modern Biology-3</title>
<link>/2023/04/29/modern-statistics-for-modern-biology-3/</link>
<pubDate>Sat, 29 Apr 2023 00:00:00 +0000</pubDate>
<guid>/2023/04/29/modern-statistics-for-modern-biology-3/</guid>
<description>10 Networks and Trees11 Image data12 Supervised Learning13 Design of High Throughput Experiments and their Analyses10 Networks and Trees## -----------------------------------------------------------------------------dats = read.table(&quot;../data/small_chemokine.txt&quot;, header = TRUE)library(&quot;ggtree&quot;)## ggtree v3.4.0 For help: https://yulab-smu.top/treedata-book/## ## If you use the ggtree package suite in published research, please cite## the appropriate paper(s):## ## Guangchuang Yu, David Smith, Huachen Zhu, Yi Guan, Tommy Tsan-Yuk Lam.</description>
</item>
<item>
<title>Modern Statistics for Modern Biology-1</title>
<link>/2023/04/28/modern-statistics-for-modern-biology-1/</link>
<pubDate>Fri, 28 Apr 2023 00:00:00 +0000</pubDate>
<guid>/2023/04/28/modern-statistics-for-modern-biology-1/</guid>
<description>1 Generative Models for Discrete Data2 Statistical Modeling3 High Quality Graphics in R4 Mixture Models1 Generative Models for Discrete Data## -----------------------------------------------------------------------------dpois(x = 3, lambda = 5)## [1] 0.1403739## -----------------------------------------------------------------------------.oldopt = options(digits = 2)0:12## [1] 0 1 2 3 4 5 6 7 8 9 10 11 12dpois(x = 0:12, lambda = 5)## [1] 0.</description>
</item>
<item>
<title>Neural Networks and Deep Learning</title>
<link>/2022/01/16/neural-networks-and-deep-learning/</link>
<pubDate>Sun, 16 Jan 2022 00:00:00 +0000</pubDate>
<guid>/2022/01/16/neural-networks-and-deep-learning/</guid>
<description>1. Using neural nets to recognize handwritten digits2. How the backpropagation algorithm works3. Improving the way neural networks learn3.1 The sigmoid output and cross-entropy cost function3.2 Overfitting and regularization3.3 Weight initialization3.5 How to choose a neural network’s hyper-parameters?4. A visual proof that neural nets can compute any function5. Why are deep neural networks hard to train?6. Deep learning6.</description>
</item>
<item>
<title>Longitudinal Analysis</title>
<link>/2021/10/27/longitudinal-analysis/</link>
<pubDate>Wed, 27 Oct 2021 00:00:00 +0000</pubDate>
<guid>/2021/10/27/longitudinal-analysis/</guid>
<description>3. Overview of Linear Models for Longitudinal DataReferences3. Overview of Linear Models for Longitudinal DataReferences1. Fitzmaurice GM, Laird NM, Ware JH. Applied longitudinal analysis. John Wiley &amp; Sons; 2012.2. Singer JD, Willett JB, Willett JB, et al. Applied longitudinal data analysis: Modeling change and event occurrence. Oxford university press; 2003.</description>
</item>
<item>
<title>Stochastic Processes</title>
<link>/2021/10/25/stochastic-processes/</link>
<pubDate>Mon, 25 Oct 2021 00:00:00 +0000</pubDate>
<guid>/2021/10/25/stochastic-processes/</guid>
<description>1. Measure and IntegrationMeasurable SpacesMeasurable FunctionsMeasuresIntegrationTransforms and Indefinite IntegralsKernels and Product Spaces2. Probability SpacesProbability Spaces and RandomExpectationsLp-spaces and Uniform IntegrabilityInformation and DeterminabilityIndependence3. Convergence4. Conditioning5. Martingales and Stochastics6. Poisson Random Measures7. L´evy Processes8. Brownian Motion9. Markov ProcessesReferences1. Measure and IntegrationMeasurable SpacesMonotone class theoremLet \(\mathcal C\) be a class of subset closed under finite intersections and containing \(\Omega\) (that is, \(\mathcal C\) is a \(\pi\)-system).</description>
</item>
<item>
<title>Probability</title>
<link>/2021/10/10/probability/</link>
<pubDate>Sun, 10 Oct 2021 00:00:00 +0000</pubDate>
<guid>/2021/10/10/probability/</guid>
<description>ReferencesReferences1. Blitzstein JK, Hwang J. Introduction to probability. Chapman; Hall/CRC; 2019.</description>
</item>
<item>
<title>Statistical Inference</title>
<link>/2021/09/05/statistical-inference/</link>
<pubDate>Sun, 05 Sep 2021 00:00:00 +0000</pubDate>
<guid>/2021/09/05/statistical-inference/</guid>
<description>1. Probability TheorySet TheoryBasics of Probability TheoryConditional Probability and IndependenceRandom VariablesDistribution FunctionsDensity and Mass Functions2. Transformations and ExpectationsDistributions of Functions of a Random Variable Theorem3. Common Families of DistributionsContinuous DistributionsGamma DistributionNormal DistributionChi-Squared DistributionStudent’s \(t\)-DistributionSnedecor’s \(F\)-DistributionMultinomial DistributionExponential FamiliesLocation and Scale FamiliesInequalities and Identities4.</description>
</item>
<item>
<title>Generalized Linear Models</title>
<link>/2021/08/10/generalized-linear-models/</link>
<pubDate>Tue, 10 Aug 2021 00:00:00 +0000</pubDate>
<guid>/2021/08/10/generalized-linear-models/</guid>
<description>1. Introduction to Linear and Generalized Linear ModelsReferences1. Introduction to Linear and Generalized Linear ModelsReferences1. Neter J, Kutner MH, Nachtsheim CJ, Wasserman W, others. Applied linear statistical models. 1996.2. Agresti A. Foundations of linear and generalized linear models. John Wiley &amp; Sons; 2015.</description>
</item>
<item>
<title>Hands-on Machine Learning: Keras-TensorFlow</title>
<link>/2021/06/21/hands-on-machine-learning-keras/</link>
<pubDate>Mon, 21 Jun 2021 00:00:00 +0000</pubDate>
<guid>/2021/06/21/hands-on-machine-learning-keras/</guid>
<description>Chapter 10 – Introduction to Artificial Neural Networks with KerasPerceptronsThe Multilayer Perceptron (MLP) and BackpropagationActivation functionsRegression MLPClassification MLPsImplementing MLPs with KerasBuilding an Image Classifier Using the Sequential APIBuilding Complex Models Using the Functional APIUsing the Subclassing API to Build Dynamic ModelsSaving and Restoring a ModelUsing Callbacks during TrainingUsing TensorBoard for VisualizationFine-Tuning Neural Network HyperparametersExercise solutionsChapter 11 – Training Deep Neural NetworksVanishing/Exploding Gradients ProblemGlorot and He InitializationNonsaturating Activation FunctionsBatch NormalizationImplement batch normalization with kerasGradient ClippingReusing Pretrained LayersTransfer Learning with KerasFaster OptimizersMomentum optimizationNesterov Accelerated GradientAdaGradRMSPropAdam OptimizationAdamax OptimizationNadam OptimizationLearning Rate SchedulingPower SchedulingExponential SchedulingPiecewise Constant SchedulingPerformance Schedulingtf.</description>
</item>
<item>
<title>Hands-on Machine Learning: Scikit-Learn</title>
<link>/2021/06/06/hands-on-machine-learning/</link>
<pubDate>Sun, 06 Jun 2021 00:00:00 +0000</pubDate>
<guid>/2021/06/06/hands-on-machine-learning/</guid>
<description>Chapter 1 – The Machine Learning landscapeExample 1-1. Training and running a linear model using Scikit-LearnExercisesChapter 2 – End-to-end Machine Learning projectWorking with Real DataLook at the Big PictureDiscover and visualize the data to gain insightsPrepare the data for Machine Learning algorithmsSelect and train a modelFine-Tune Your ModelExtra materialA full pipeline with both preparation and predictionModel persistence using joblibExample SciPy distributions for RandomizedSearchCVExercise solutions1.</description>
</item>
<item>
<title>Introduction to Algorithms: Foundations</title>
<link>/2021/06/04/introduction-to-algorithms/</link>
<pubDate>Fri, 04 Jun 2021 00:00:00 +0000</pubDate>
<guid>/2021/06/04/introduction-to-algorithms/</guid>
<description>1 The Role of Algorithms in Computing1.1 AlgorithmsExercises1.1-11.1-21.1-31.1-41.1-51.2 Algorithms as a technologyExercises1.2-11.2-21.2-3Problems1-1 Comparison of running timesReferences1 The Role of Algorithms in ComputingWhat are algorithms? Why is the study of algorithms worthwhile? What is the role of algorithms relative to other technologies used in computers?</description>
</item>
<item>
<title>Structure and Interpretation of Computer Programs (SICP)</title>
<link>/2021/06/03/structure-and-interpretation-of-computer-programs-sicp/</link>
<pubDate>Thu, 03 Jun 2021 00:00:00 +0000</pubDate>
<guid>/2021/06/03/structure-and-interpretation-of-computer-programs-sicp/</guid>
<description>ReferencesReferenceshttps://github.com/rmculpepper/iracket
1. Abelson H, Sussman GJ. Structure and interpretation of computer programs. The MIT Press; 1996.</description>
</item>
<item>
<title>Bayesian Thinking</title>
<link>/2021/05/25/bayesian-thinking/</link>
<pubDate>Tue, 25 May 2021 00:00:00 +0000</pubDate>
<guid>/2021/05/25/bayesian-thinking/</guid>
<description>1 The Basics of Bayesian StatisticsBayes’ RuleConditional Probabilities &amp; Bayes’ RuleBayes’ Rule and Diagnostic TestingBayes UpdatingBayesian vs. Frequentist Definitions of ProbabilityInference for a Proportion: Frequentist ApproachInference for a Proportion: Bayesian ApproachEffect of Sample Size on the PosteriorFrequentist vs. Bayesian Inference2 Bayesian InferenceContinuous Variables and Eliciting Probability DistributionsFrom the Discrete to the ContinuousElicitationConjugacyInference on a Binomial ProportionThe Gamma-Poisson Conjugate FamiliesThe Normal-Normal Conjugate FamiliesNon-Conjugate PriorsCredible IntervalsPredictive Inference3 Losses and Decision MakingBayesian Decision MakingPosterior Probabilities of Hypotheses and Bayes Factors4 Inference and Decision-Making with Multiple ParametersConjugate Prior for \(\mu\) and \(\sigma^2\)Conjugate Posterior DistributionMarginal Distribution for \(\mu\): Student \(t\)Credible Intervals for \(\mu\)Example: TTHM in TapwaterSection Summary(Optional) DerivationsMonte Carlo InferenceMonte Carlo SamplingMonte Carlo Inference: Tap Water ExampleMonte Carlo Inference for Functions of ParametersSummaryPrior Predictive DistributionTap Water Example (continued)Sampling from the Prior Predictive in RPosterior PredictiveSummaryMarkov Chain Monte Carlo (MCMC)5 Hypothesis Testing with Normal PopulationsBayes Factors for Testing a Normal Mean: variance knownComparing Two Paired Means using Bayes FactorsComparing Independent Means: Hypothesis TestingInference after Testing6 Introduction to Bayesian RegressionBayesian Simple Linear RegressionFrequentist Ordinary Least Square (OLS) Simple Linear RegressionBayesian Simple Linear Regression Using the Reference PriorInformative Priors(Optional) Derivations of Marginal Posterior Distributions of \(\alpha\), \(\beta\), \(\sigma^2\)Marginal Posterior Distribution of \(\beta\)Marginal Posterior Distribution of \(\alpha\)Marginal Posterior Distribution of \(\sigma^2\)Joint Normal-Gamma Posterior DistributionsPosterior Distribution of \(\epsilon_j\) Conditioning On \(\sigma^2\)Implementation Using BAS PackageBayesian Multiple Linear RegressionThe ModelData Pre-processingSpecify Bayesian Prior DistributionsFitting the Bayesian ModelPosterior Means and Posterior Standard DeviationsCredible Intervals SummarySummary7 Bayesian Model ChoiceDefinition of BICBackward Elimination with BICCoefficient Estimates Under Reference Prior for Best BIC ModelOther CriteriaModel UncertaintyCalculating Posterior Probability in RBayesian Model AveragingVisualizing Model UncertaintyBayesian Model Averaging Using Posterior ProbabilityCoefficient Summary under BMASummary8 Stochastic Explorations Using MCMCMarkov Chain Monte Carlo ExplorationOther Priors for Bayesian Model UncertaintyZellner’s \(g\)-PriorBayes Factor of Zellner’s \(g\)-PriorKid’s Cognitive Score ExampleThe UScrime Data Set and Data ProcessingBayesian Models and DiagnosticsPosterior Uncertainty in CoefficientsPredictionModel ChoicePrediction with New DataSummaryReferences1 The Basics of Bayesian StatisticsBayesian statistics mostly involves conditional probability, which is the the probability of an event A given event B, and it can be calculated using the Bayes rule.</description>
</item>
<item>
<title>AOS chapter25 Simulation Methods</title>
<link>/2021/05/24/aos-chapter25-simulation-methods/</link>
<pubDate>Mon, 24 May 2021 00:00:00 +0000</pubDate>
<guid>/2021/05/24/aos-chapter25-simulation-methods/</guid>
<description>25. Simulation Methods25.1 Bayesian Inference Revisited25.2 Basic Monte Carlo Integration25.3 Importance Sampling25.4 MCMC Part I: The Metropolis-Hastings Algorithm25.5 MCMC Part II: Different Flavors25.7 ExercisesReferences25. Simulation MethodsIn this chapter we will see that by generating data in a clever way, we can solve a number of problems such as integrating or maximizing a complicated function. For integration, we will study 3 methods:</description>
</item>
<item>
<title>AOS chapter24 Stochastic Processes</title>
<link>/2021/05/22/aos-chapter24-stochastic-processes/</link>
<pubDate>Sat, 22 May 2021 00:00:00 +0000</pubDate>
<guid>/2021/05/22/aos-chapter24-stochastic-processes/</guid>
<description>24. Stochastic Processes24.1 Introduction24.2 Markov Chains24.3 Poisson Process24.6 ExercisesReferences24. Stochastic Processes24.1 IntroductionA stochastic process \(\{ X_t : t \in T \}\) is a collection of random variables. We shall sometimes write \(X(t)\) instead of \(X_t\). The variables \(X_t\) take values in some set \(\mathcal{X}\) called the state space. The set \(T\) is called the index set and for our purposes can be thought of as time.</description>
</item>
<item>
<title>AOS chapter23 Classification</title>
<link>/2021/05/20/aos-chapter23-classification/</link>
<pubDate>Thu, 20 May 2021 00:00:00 +0000</pubDate>
<guid>/2021/05/20/aos-chapter23-classification/</guid>
<description>23. Classification23.1 Introduction23.2 Error Rates and The Bayes Classifier23.3 Gaussian and Linear Classifiers23.4 Linear Regression and Logistic Regression23.5 Relationship Between Logistic Regression and LDA23.6 Density Estimation and Naive Bayes23.7 Trees23.8 Assessing Error Rates and Choosing a Good Classifier23.9 Support Vector Machines23.10 Kernelization23.11 Other Classifiers23.13 ExercisesReferences23. Classification23.1 IntroductionThe problem of predicting a discrete variable \(Y\) from another random variable \(X\) is called classification, supervised learning, discrimination or pattern recognition.</description>
</item>
<item>
<title>AOS chapter22 Smoothing Using Orthogonal Functions</title>
<link>/2021/05/17/aos-chapter22-smoothing-using-orthogonal-functions/</link>
<pubDate>Mon, 17 May 2021 00:00:00 +0000</pubDate>
<guid>/2021/05/17/aos-chapter22-smoothing-using-orthogonal-functions/</guid>
<description>22. Smoothing Using Orthogonal Functions22.1 Orthogonal Functions and \(L_2\) Spaces22.2 Density Estimation22.3 Regression22.4 Wavelets22.6 ExercisesReferences22. Smoothing Using Orthogonal FunctionsIn this Chapter we study a different approach to nonparametric curve estimation based on orthogonal functions. We begin with a brief introduction to the theory of orthogonal functions. Then we turn to density estimation and regression.
22.1 Orthogonal Functions and \(L_2\) SpacesLet \(v = (v_1, v_2, v_3)\) denote a three dimensional vector.</description>
</item>
<item>
<title>AOS chapter21 Nonparametric Curve Estimation</title>
<link>/2021/05/09/aos-chapter21-nonparametric-curve-estimation/</link>
<pubDate>Sun, 09 May 2021 00:00:00 +0000</pubDate>
<guid>/2021/05/09/aos-chapter21-nonparametric-curve-estimation/</guid>
<description>21. Nonparametric Curve Estimation21.1 The Bias-Variance Tradeoff21.2 Histograms21.3 Kernel Density Estimation21.4 Nonparametric Regression21.5 Appendix: Confidence Sets and Bias21.7 ExercisesReferences21. Nonparametric Curve EstimationIn this Chapter we discuss the nonparametric estimation of probability density functions and regression functions, which we refer to as curve estimation.
In Chapter 8 we saw it is possible to consistently estimate a cumulative distribution function \(F\) without making any assumptions about \(F\).</description>
</item>
<item>
<title>Density Estimation</title>
<link>/2021/05/05/density-estimation/</link>
<pubDate>Wed, 05 May 2021 00:00:00 +0000</pubDate>
<guid>/2021/05/05/density-estimation/</guid>
<description>1. INTRODUCTION2. SURVEY OF EXISTING METHODS2.1 Introduction2.2. Histograms2.3. The naive estimator2.4. The kernel estimator2.5. The nearest neighbour method2.6. The variable kernel method2.7. Orthogonal series estimators2.8. Maximum penalized likelihood estimators2.9. General weight function estimators1. INTRODUCTIONReferences1. INTRODUCTIONThe probability density function is a fundamental concept in statistics. Consider any random quantity \(X\) that has probability density function \(f\).</description>
</item>
<item>
<title>AOS chapter20 Directed Graphs</title>
<link>/2021/05/03/aos-chapter20-directed-graphs/</link>
<pubDate>Mon, 03 May 2021 00:00:00 +0000</pubDate>
<guid>/2021/05/03/aos-chapter20-directed-graphs/</guid>
<description>20. Directed Graphs20.1 Introduction20.2 DAG’s20.3 Probability and DAG’s20.4 More Independence Relations20.5 Estimation for DAG’s20.6 Causation Revisited20.8 Exercises20. Directed Graphs20.1 IntroductionDirected graphs are similar to undirected graphs, but there are arrows between vertices instead of edges. Like undirected graphs, directed graphs can be used to represent independence relations. They can also be used as an alternative to counterfactuals to represent causal relationships.</description>
</item>
<item>
<title>AOS chapter19 Causal Inference</title>
<link>/2021/05/01/aos-chapter19-causal-inference/</link>
<pubDate>Sat, 01 May 2021 00:00:00 +0000</pubDate>
<guid>/2021/05/01/aos-chapter19-causal-inference/</guid>
<description>19. Causal Inference19.1 The Counterfactual Model19.2 Beyond Binary Treatments19.3 Observational Studies and Confounding19.4 Simpson’s Paradox19.6 ExercisesReferences19. Causal InferenceIn this chapter we discuss causation. Roughly speaking “\(X\) causes \(Y\)” means that changing the value of \(X\) will change the distribution of \(Y\). When \(X\) causes \(Y\), \(X\) and \(Y\) will be associated but the reverse is not, in general, true.</description>
</item>
<item>
<title>AOS chapter18 Loglinear Models</title>
<link>/2021/04/30/aos-chapter18-loglinear-models/</link>
<pubDate>Fri, 30 Apr 2021 00:00:00 +0000</pubDate>
<guid>/2021/04/30/aos-chapter18-loglinear-models/</guid>
<description>18. Loglinear Models18.1 The Loglinear Model18.2 Graphical Log-Linear Models18.3 Hierarchical Log-Linear Models18.4 Model Generators18.5 Lattices18.6 Fitting Log-Linear Models to Data18.8 ExercisesReferences18. Loglinear Models18.1 The Loglinear ModelLet \(X = (X_1, \dots, X_m)\) be a random vector with probability
\[ f(x) = \mathbb{P}(X = x) = \mathbb{P}(X_1 = x_1, \dots, X_m = x_m) \]</description>
</item>
<item>
<title>AOS chapter17 Undirected Graphs and Conditional Independence</title>
<link>/2021/04/26/aos-chapter16-undirected-graphs-and-conditional-independence/</link>
<pubDate>Mon, 26 Apr 2021 00:00:00 +0000</pubDate>
<guid>/2021/04/26/aos-chapter16-undirected-graphs-and-conditional-independence/</guid>
<description>17. Undirected Graphs and Conditional Independence17.1 Conditional Independence17.2 Undirected Graphs17.3 Probability and Graphs17.4 Fitting Graphs to Data17.6 ExercisesReferences17. Undirected Graphs and Conditional Independence\(k\) binary variables \(Y_1, \dots, Y_k\) correspond to a multinomial with \(N = 2^k\) categories. Even for moderately large \(k\), \(2^k\) will be huge. It can be shown in this case that the MLE is a poor estimator, because the data are sparse.</description>
</item>
<item>
<title>AOS chapter16 Inference about Independence</title>
<link>/2021/04/25/aos-chapter16-inference-about-independence/</link>
<pubDate>Sun, 25 Apr 2021 00:00:00 +0000</pubDate>
<guid>/2021/04/25/aos-chapter16-inference-about-independence/</guid>
<description>16. Inference about Independence16.1 Two Binary Variables16.2 Interpreting the Odds Ratios16.3 Two Discrete Variables16.4 Two Continuous Variables16.5 One Continuous Variable and One Discrete16.7 ExercisesReferences16. Inference about IndependenceThis chapter addresses two questions:
How do we test if two random variables are independent? How do we estimate the strength of dependence between two random variables? Recall we write \(Y \text{ ⫫ } Z\) to mean that \(Y\) and \(Z\) are independent.</description>
</item>
<item>
<title>AOS chapter15 Multivariate Models</title>
<link>/2021/04/24/aos-chapter15-multivariate-models/</link>
<pubDate>Sat, 24 Apr 2021 00:00:00 +0000</pubDate>
<guid>/2021/04/24/aos-chapter15-multivariate-models/</guid>
<description>15. Multivariate Models15.1 Random Vectors15.2 Estimating the Correlation15.3 Multinomial15.4 Multivariate Normal15.5 Appendix15.6 Exercisesbox_mullerReferences15. Multivariate ModelsReview of notation from linear algebra:
If \(x\) and \(y\) are vectors, then \(x^T y = \sum_j x_j y_j\).If \(A\) is a matrix then \(\text{det}(A)\) denotes the determinant of \(A\), \(A^T\) denotes the transpose of A, and \(A^{-1}\) denotes the inverse of \(A\) (if the inverse exists).</description>
</item>
<item>
<title>AOS chapter14 Linear Regression</title>
<link>/2021/04/23/aos-chapter14-linear-regression/</link>
<pubDate>Fri, 23 Apr 2021 00:00:00 +0000</pubDate>
<guid>/2021/04/23/aos-chapter14-linear-regression/</guid>
<description>14. Linear Regression14.1 Simple Linear Regression14.2 Least Squares and Maximum Likelihood14.3 Properties of the Least Squares Estimators14.4 Prediction14.5 Multiple Regression14.6 Model Selection14.7 The Lasso14.8 Technical Appendix14.9 ExercisesReferences14. Linear RegressionRegression is a method for studying the relationship between a response variable \(Y\) and a covariate \(X\). The covariate is also called a predictor variable or feature.</description>
</item>
<item>
<title>AOS chapter13 Statistical Decision Theory</title>
<link>/2021/04/22/aos-chapter13-statistical-decision-theory/</link>
<pubDate>Thu, 22 Apr 2021 00:00:00 +0000</pubDate>
<guid>/2021/04/22/aos-chapter13-statistical-decision-theory/</guid>
<description>13. Statistical Decision Theory13.1 Preliminaries13.2 Comparing Risk Functions13.3 Bayes Estimators13.4 Minimax Rules13.5 Maximum Likelihood, Minimax and Bayes13.6 Admissibility13.7 Stein’s Paradox13.9 ExercisesReferences13. Statistical Decision Theory13.1 PreliminariesDecision theory is a formal theory for comparing statistical procedures.
In the language of decision theory, an estimator is sometimes called a decision rule and the possible values of the decision rule are called actions.</description>
</item>
<item>
<title>AOS chapter12 Bayesian Inference</title>
<link>/2021/04/21/aos-chapter12-bayesian-inference/</link>
<pubDate>Wed, 21 Apr 2021 00:00:00 +0000</pubDate>
<guid>/2021/04/21/aos-chapter12-bayesian-inference/</guid>
<description>12. Bayesian Inference12.1 Bayesian Philosophy12.2 The Bayesian Method12.3 Functions of Parameters12.4 Simulation12.5 Large Sample Properties for Bayes’ Procedures12.6 Flat Priors, Improper Priors and “Noninformative” Priors12.7 Multiparameter Problems12.8 Strengths and Weaknesses of Bayesian Inference12.9 Appendix12.11 ExercisesReferences12. Bayesian Inference12.1 Bayesian PhilosophyPostulates of frequentist (or classical) inference:
Probability refers to limiting relative frequencies.</description>
</item>
<item>
<title>AOS Chapter11 Hypothesis Testing and p-values</title>
<link>/2021/04/20/aos-chapter11-hypothesis-testing-and-p-values/</link>
<pubDate>Tue, 20 Apr 2021 00:00:00 +0000</pubDate>
<guid>/2021/04/20/aos-chapter11-hypothesis-testing-and-p-values/</guid>
<description>11. Hypothesis Testing and p-values11.1 The Wald Test11.2 p-values11.3 The \(\chi^2\) distribution11.4 Pearson’s \(\chi^2\) Test for Multinomial Data11.5 The Permutation Test11.6 Multiple Testing11.7 Technical Appendix11.9 ExercisesReferences11. Hypothesis Testing and p-valuesSuppose we partition the parameters space \(\Theta\) into two disjoint sets \(\Theta_0\) and \(\Theta_1\) and we wish to test
\[H_0: \theta \in \Theta_0\quad \text{versus} \quadH_1: \theta \in \Theta_1\]</description>
</item>
<item>
<title>AOS chapter10 Parametric Inference</title>
<link>/2021/04/18/aos-chapter10-parametric-inference/</link>
<pubDate>Sun, 18 Apr 2021 00:00:00 +0000</pubDate>
<guid>/2021/04/18/aos-chapter10-parametric-inference/</guid>
<description>10. Parametric Inference10.1 Parameter of interest10.2 The Method of Moments10.3 Maximum Likelihood10.4 Properties of Maximum Likelihood Estimators10.5 Consistency of Maximum Likelihood Estimator10.6 Equivalence of the MLE10.7 Asymptotic Normality10.8 Optimality10.9 The Delta Method10.10 Multiparameter Models10.11 The Parametric Bootstrap10.12 Technical Appendix10.13 ExercisesReferences10. Parametric InferenceParametric models are of the form</description>
</item>
<item>
<title>AOS Chapter09 The Bootstrap</title>
<link>/2021/04/17/aos-chapter09-the-bootstrap/</link>
<pubDate>Sat, 17 Apr 2021 00:00:00 +0000</pubDate>
<guid>/2021/04/17/aos-chapter09-the-bootstrap/</guid>
<description>9. The Bootstrap9.1 Simulation9.2 Bootstrap Variance Estimation9.3 Bootstrap Confidence Intervals9.5 Technical Appendix9.6 ExercisesReferences9. The BootstrapLet \(X_1, \dots, X_n \sim F\) be random variables distributed according to \(F\), and
\[ T_n = g(X_1, \dots, X_n)\]
be a statistic, that is, any function of the data. Suppose we want to know \(\mathbb{V}_F(T_n)\), the variance of \(T_n\).
For example, if \(T_n = n^{-1}\sum_{i=1}^nX_i\) then \(\mathbb{V}_F(T_n) = \sigma^2/n\) where \(\sigma^2 = \int (x - \mu)^2dF(x)\) and \(\mu = \int x dF(x)\).</description>
</item>
<item>
<title>AOS Chapter08 Estimating the CDF and Statistical Functionals</title>
<link>/2021/04/16/aos-chapter08-estimating-the-cdf-and-statistical-functionals/</link>
<pubDate>Fri, 16 Apr 2021 00:00:00 +0000</pubDate>
<guid>/2021/04/16/aos-chapter08-estimating-the-cdf-and-statistical-functionals/</guid>
<description>8. Estimating the CDF and Statistical Functionals8.1 Empirical distribution function8.2 Statistical Functionals8.3 Technical Appendix8.5 ExercisesReferences8. Estimating the CDF and Statistical Functionals8.1 Empirical distribution functionThe empirical distribution function \(\hat{F_n}\) is the CDF that puts mass \(1/n\) at each data point \(X_i\). Formally,
\[\begin{align}\hat{F_n}(x) &amp; = \frac{\sum_{i=1}^n I\left(X_i \leq x \right)}{n} \\&amp;= \frac{\text{#}|\text{observations less than or equal to x}|}{n}\end{align}\]</description>
</item>
<item>
<title>AOS Chapter07 Models, Statistical Inference and Learning</title>
<link>/2021/04/15/aos-chapter07-models-statistical-inference-and-learning/</link>
<pubDate>Thu, 15 Apr 2021 00:00:00 +0000</pubDate>
<guid>/2021/04/15/aos-chapter07-models-statistical-inference-and-learning/</guid>
<description>7. Models, Statistical Inference and Learning7.2 Parametric and Nonparametric Models7.3 Fundamental Concepts in Inference7.5 Technical AppendixReferences7. Models, Statistical Inference and Learning7.2 Parametric and Nonparametric ModelsA statistical model is a set of distributions \(\mathfrak{F}\).
A parametric model is a set \(\mathfrak{F}\) that may be parametrized by a finite number of parameters. For example, if we assume that data comes from a normal distribution then</description>
</item>
<item>
<title>AOS Chapter06 Convergence of Random Variables</title>
<link>/2021/04/14/aos-chapter05-convergence-of-random-variables/</link>
<pubDate>Wed, 14 Apr 2021 00:00:00 +0000</pubDate>
<guid>/2021/04/14/aos-chapter05-convergence-of-random-variables/</guid>
<description>6. Convergence of Random Variables6.2 Types of convergence6.3 The Law of Large Numbers6.4 The Central Limit Theorem6.5 The Delta Method6.6 Technical appendix6.8 ExercisesReferences6. Convergence of Random Variables6.2 Types of convergence\(X_n\) converges to \(X\) in probability, written \(X_n \xrightarrow{\text{P}} X\), if, for every \(\epsilon &gt; 0\):
\[ \mathbb{P}( |X_n - X| &gt; \epsilon ) \rightarrow 0 \]</description>
</item>
<item>
<title>AOS Chapter05 Inequalities</title>
<link>/2021/04/13/aos-chapter05-inequalities/</link>
<pubDate>Tue, 13 Apr 2021 00:00:00 +0000</pubDate>
<guid>/2021/04/13/aos-chapter05-inequalities/</guid>
<description>5. Inequalities5.1 Markov and Chebyshev Inequalities5.2 Hoeffding’s Inequality5.3 Cauchy-Schwartz and Jensen Inequalities5.4 Technical Appendix: Proof of Hoeffding’s Inequality5.6 ExercisesReferences5. Inequalities5.1 Markov and Chebyshev InequalitiesTheorem 5.1 (Markov’s Inequality). Let \(X\) be a non-negative random variable and suppose that \(\mathbb{E}(X)\) exists. For any \(t &gt; 0\),
\[ \mathbb{P}(X &gt; t) \leq \frac{\mathbb{E}(X)}{t} \]
Proof.
\[ \mathbb{E}(X)=\int_0^\infty xf(x) dx=\int_0^t xf(x) dx + \int_t^\infty xf(x) dx\geq \int_t^\infty xf(x) dx\geq t \int_t^\infty f(x) dx= t \mathbb{P}(X &gt; t)\]</description>
</item>
<item>
<title>AOS chapter04 Expectation, negative binomial distribution and gene counts, beta distribution and Order Statistics</title>
<link>/2021/04/12/aos-chapter04-expectation/</link>
<pubDate>Mon, 12 Apr 2021 00:00:00 +0000</pubDate>
<guid>/2021/04/12/aos-chapter04-expectation/</guid>
<description>4. Expectation4.1 Expectation of a Random Variable4.2 Properties of Expectations4.3 Variance and Covariance4.4 Expectation and Variance of Important Random Variables4.5 Conditional Expectation4.6 Technical Appendix4.7 Exercises4.8 Negative binomial (or gamma-Poisson) distribution and gene expression counts modeling4.9 Beta distribution and Order Statistics4.10 The conditional log-likelihood (CML) of NB distributionReferences4. Expectation4.1 Expectation of a Random VariableThe expected value, mean or first moment of \(X\) is defined to be</description>
</item>
<item>
<title>AOS chapter03 Random Variables</title>
<link>/2021/04/11/aos-chapter03-random-variables/</link>
<pubDate>Sun, 11 Apr 2021 00:00:00 +0000</pubDate>
<guid>/2021/04/11/aos-chapter03-random-variables/</guid>
<description>3. Random Variables3.1 Introduction3.2 Distribution Functions and Probability Functions3.3 Some Important Discrete Random Variables3.4 Some Important Continuous Random Variables3.5 Bivariate Distributions3.6 Marginal Distributions3.7 Independent Random Variables3.8 Conditional Distributions3.9 Multivariate Distributions and IID Samples3.10 Two Important Multivariate Distributions3.11 Transformations of Random Variables3.12 Transformation of Several Random Variables3.13 Technical Appendix3.14 ExercisesReferences3.</description>
</item>
<item>
<title>AOS chapter02 Probability</title>
<link>/2021/04/09/aos-chapter02-probability/</link>
<pubDate>Fri, 09 Apr 2021 00:00:00 +0000</pubDate>
<guid>/2021/04/09/aos-chapter02-probability/</guid>
<description>2. Probability2.2 Sample Spaces and Events2.3 Probability2.4 Probability on Finite Sample Spaces2.5 Independent Events2.6 Conditional Probability2.7 Bayes’ Theorem2.9 Technical Appendix2.10 ExercisesReferences2. Probability2.2 Sample Spaces and EventsThe sample space \(\Omega\) is the set of possible outcomes of an experiment. Points \(\omega\) in \(\Omega\) are called sample outcomes or realizations. Events are subsets of \(\Omega\).</description>
</item>
<item>
<title>ESL chapter 4 Linear Methods for Classification</title>
<link>/2021/04/03/esl-chapter-4-linear-methods-for-classification/</link>
<pubDate>Sat, 03 Apr 2021 00:00:00 +0000</pubDate>
<guid>/2021/04/03/esl-chapter-4-linear-methods-for-classification/</guid>
<description>Chapter 4. Linear Methods for Classification\(\S\) 4.1. IntroductionLinear regressionDiscriminant functionLogit transformationSeparating hyperplanesScope for generalization\(\S\) 4.2. Linear Regression of an Indicator MatrixRationaleA more simplistic viewpointMasked class with the regression approach\(\S\) 4.3. Linear Discriminant AnalysisLDA from multivariate GaussianEstimating parametersSimple correspondence between LDA and linear regression with two classesPractice beyond the Gaussian assumptionQuadratic Discriminant AnalysisWhy LDA and QDA have such a good track record?</description>
</item>
<item>
<title>ESL chapter 3 exercises</title>
<link>/2021/03/23/esl-chapter-3-exercises/</link>
<pubDate>Tue, 23 Mar 2021 00:00:00 +0000</pubDate>
<guid>/2021/03/23/esl-chapter-3-exercises/</guid>
<description>Ex. 3.9 (using the QR decomposition for fast forward-stepwise selection)Ex. 3.10 (using the z-scores for fast backwards stepwise regression)Ex. 3.11 (multivariate linear regression with different \(\Sigma_i\))Ex. 3.12 (ordinary least squares to implement ridge regression)Ex. 3.13 (principal component regression)Ex. 3.14 (when the inputs are orthogonal PLS stops after m = 1 step)Ex. 3.15 (PLS seeks directions that have high variance and high correlation)Relation to the optimization problemEx.</description>
</item>
<item>
<title>ESL chapter 3 Linear Methods for Regression</title>
<link>/2021/02/24/esl-chapter-3-linear-methods-for-regression/</link>
<pubDate>Wed, 24 Feb 2021 00:00:00 +0000</pubDate>
<guid>/2021/02/24/esl-chapter-3-linear-methods-for-regression/</guid>
<description>Chapter 3. Linear Methods for Regression\(\S\) 3.1. Introduction\(\S\) 3.2. Linear Regression Models and Least SquaresThe linear modelLeast squares fitSolution of least squaresGeometrical representation of the least squares estimateSampling properties of \(\hat{\beta}\)Inference and hypothesis testingConfidence intervals\(\S\) 3.2.1. Example: Prostate Cancer\(\S\) 3.2.2. The Gauss-Markov TheoremThe statement of the theoremImplications of the Gauss-Markov theoremRelation between prediction accuracy and MSE\(\S\) 3.</description>
</item>
<item>
<title>ESL chapter 2 Overview of Supervised Learning</title>
<link>/2021/02/12/esl-chapter-2-overview-of-supervised-learning/</link>
<pubDate>Fri, 12 Feb 2021 00:00:00 +0000</pubDate>
<guid>/2021/02/12/esl-chapter-2-overview-of-supervised-learning/</guid>
<description>\(\S\) Supervised Learning\(\S\) 2.3. Two Simple Approaches to Prediction: Least Squares and Nearest Neighbors\(\S\) 2.3.3 From Least Squares to Nearest Neighbors\(\S\) 2.3.1. Linear Models and Least SquaresLinear ModelsHow to fit the model: Least squaresLinear model in a classification contextWhere the data came from?\(\S\) 2.3.2 Nearest-Neighbor MethodsDo not satisfy with the training resultsEffective number of parametersDo not appreciate the training errors\(\S\) 2.</description>
</item>
<item>
<title>Single cell data analysis using scanpy</title>
<link>/2021/02/03/single-cell-data-analysis-using-scanpy/</link>
<pubDate>Wed, 03 Feb 2021 00:00:00 +0000</pubDate>
<guid>/2021/02/03/single-cell-data-analysis-using-scanpy/</guid>
<description>import numpy as npimport pandas as pdimport scanpy as sc# !mkdir data# !wget http://cf.10xgenomics.com/samples/cell-exp/1.1.0/pbmc3k/pbmc3k_filtered_gene_bc_matrices.tar.gz -O data/pbmc3k_filtered_gene_bc_matrices.tar.gz# !cd data; tar -xzf pbmc3k_filtered_gene_bc_matrices.tar.gz# !mkdir writesc.settings.verbosity = 3 # verbosity: errors (0), warnings (1), info (2), hints (3)sc.logging.print_header()sc.settings.set_figure_params(dpi=80, facecolor=&#39;white&#39;)scanpy==1.6.1 anndata==0.7.5 umap==0.4.6 numpy==1.19.2 scipy==1.5.2 pandas==1.2.1 scikit-learn==0.23.2 statsmodels==0.12.1 python-igraph==0.8.3 leidenalg==0.8.3results_file = &#39;write/pbmc3k.h5ad&#39; # the file that will store the analysis resultsadata = sc.</description>
</item>
<item>
<title>The C Programming Language</title>
<link>/2021/02/03/c/</link>
<pubDate>Wed, 03 Feb 2021 00:00:00 +0000</pubDate>
<guid>/2021/02/03/c/</guid>
<description>CHAPTER 1: A Tutorial Introduction1.1 Getting StartedExercise 1-1 .Exercise 1-2 .1.2 Variables and Arithmetic ExpressionsExercise 1-3 .Exercise 1-4 .1.3 The For StatementExercise 1-5 .1.4 Symbolic Constants1.5 Character Input and Output1.5.1 File CopyingExercise 1-6 .Exercise 1-7 .1.5.2 Character Counting1.5.3 Line CountingExercise 1-8 . Write a program to count blanks, tabs, and newlines.</description>
</item>
<item>
<title>Single cell RNA-seq data analysis using Markov Affinity-Based Graph Imputation</title>
<link>/2021/01/28/single-cell-rna-seq-data-analysis-using-markov-affinity-based-graph-imputation/</link>
<pubDate>Thu, 28 Jan 2021 00:00:00 +0000</pubDate>
<guid>/2021/01/28/single-cell-rna-seq-data-analysis-using-markov-affinity-based-graph-imputation/</guid>
<description>import magicimport pandas as pdimport matplotlib.pyplot as pltX = pd.read_csv(&quot;test_data.csv&quot;)X.shape(500, 197)magic_operator = magic.MAGIC()X_magic = magic_operator.fit_transform(X, genes=[&#39;VIM&#39;, &#39;CDH1&#39;, &#39;ZEB1&#39;])X_magic.shapeCalculating MAGIC...Running MAGIC on 500 cells and 197 genes.Calculating graph and diffusion operator...Calculating PCA...Calculated PCA in 0.02 seconds.Calculating KNN search...Calculated KNN search in 0.04 seconds.Calculating affinities...Calculated affinities in 0.03 seconds.Calculated graph and diffusion operator in 0.</description>
</item>
<item>
<title>Algorithms in SICP</title>
<link>/2021/01/22/algorithms-in-sicp/</link>
<pubDate>Fri, 22 Jan 2021 00:00:00 +0000</pubDate>
<guid>/2021/01/22/algorithms-in-sicp/</guid>
<description>1 Building Abstractions with Procedures1.2 Procedures and the Processes They Generate1.2.1 Linear Recursion and Iteration1.3 Formulating Abstractions with Higher-Order Procedures1.3.1 Procedures as Arguments1.3.2 Constructing Procedures Using lambda1.3.3 Procedures as General Methods1.3.4 Procedures as Returned Values2 Building Abstractions with Data2.1 Introduction to Data Abstraction2.1.1 Example: Arithmetic Operations for Rational Numbers2.</description>
</item>
<item>
<title>Topology</title>
<link>/2021/01/17/topology/</link>
<pubDate>Sun, 17 Jan 2021 00:00:00 +0000</pubDate>
<guid>/2021/01/17/topology/</guid>
<description>If \(X\) is a topological space with topology \(\mathscr T\), we say that a subset \(U\) of \(X\) is an open set of \(X\) if \(U\) belongs to the collection \(\mathscr T\). Using this terminology, one can say that a topological space is a set \(X\) together with a collection of subsets of \(X\), called open sets, such that \(\varnothing\) and \(X\) are both open, and such that arbitrary unions and finite intersections of open sets are open.</description>
</item>
<item>
<title>Banach space</title>
<link>/2021/01/03/banach-space/</link>
<pubDate>Sun, 03 Jan 2021 00:00:00 +0000</pubDate>
<guid>/2021/01/03/banach-space/</guid>
<description>A complex vector space \(X\) is said to be a normed linear space if to each \(x\in X\), there is associated a nonnegative real number \(\lVert x\rVert\), called the norm of \(x\), such that
\(\lVert x+y\rVert\leq\lVert x\rVert+\lVert y\rVert\) for all \(x\) and \(y\in X\),
\(\lVert ax\rVert=|a|\lVert x\rVert\) if \(x\in X\) and \(a\) is a scalar.
\(\lVert x\rVert=0\) implies \(x=0\).
Every normed linear space may be regarded as a metric space, the distance between \(x\) and \(y\) being \(\lVert x-y\rVert\).</description>
</item>
<item>
<title>Trigonometric Series</title>
<link>/2021/01/02/trigonometric-series/</link>
<pubDate>Sat, 02 Jan 2021 00:00:00 +0000</pubDate>
<guid>/2021/01/02/trigonometric-series/</guid>
<description>Let \(T\) be the unit circle in the complex plane, i.e., the set of all complex numbers of absolute value \(1\). If \(F\) is any function on \(T\) and if \(f\) is defined on \(R^1\) by \[f(t)=F(e^{it})\] Then \(f\) is a periodic function of period \(2\pi\). Conversely, if \(f\) is a function on \(R^1\), with period \(2\pi\), then there is a function \(F\) on \(T\) such that \[f(t)=F(e^{it})\] holds.</description>
</item>
<item>
<title>Hilbert Space</title>
<link>/2020/12/31/hilbert-space/</link>
<pubDate>Thu, 31 Dec 2020 00:00:00 +0000</pubDate>
<guid>/2020/12/31/hilbert-space/</guid>
<description>A complex vector space \(H\) is called an inner product space if to each ordered pair of vectors \(x,y\in H\) there is associated a complex number \((x,y)\), the so-called “inner product” of \(x\) and \(y\), such that the following rules hold:
\((a)\) \((y,x)=\overline{(x,y)}\). (The bar denotes complex conjugation.)
\((b)\) \((x+y,z)=(x,z)+(y,z),\quad x,y,z\in H\)
\((c)\) \((ax,y)=a(x,y),\quad x,y\in H, a\text{ is a scalar}\)
\((d)\) \((x,x)\ge0,\quad \forall x\in H\)
\((e)\) \((x,x)=0\) only if \(x=0\).</description>
</item>
<item>
<title>L^p-Spaces</title>
<link>/2020/12/29/l-p-spaces/</link>
<pubDate>Tue, 29 Dec 2020 00:00:00 +0000</pubDate>
<guid>/2020/12/29/l-p-spaces/</guid>
<description>A real function \(\varphi\) defined on a segment \((a,b),\quad -\infty\leq a&lt;b\leq \infty\), is called convex if the inequality \[\varphi((1-\lambda)x+\lambda y)\leq(1-\lambda)\varphi(x)+\lambda\varphi(y)\] or equivalently \[\frac{\varphi(t)-\varphi(s)}{t-s}\leq\frac{\varphi(u)-\varphi(t)}{u-t},\quad a&lt;s&lt;t&lt;u&lt;b\] holds whenever \(a&lt;x,y&lt;b,\quad 0\leq\lambda\leq1\). If \(x&lt;t&lt;y\), then the point \((t,\varphi(t))\) should lie below or on the line connecting the points \((x,\varphi(x))\) and \((y,\varphi(y))\) in the plane.
If \(\varphi\) is convex on \((a, b)\) then \(\varphi\) is continuous on \((a, b)\).
Suppose \(a&lt;s&lt;x&lt;y&lt;t&lt;b\), write \(S\) for the point \((s,\varphi(s))\) in the plane, and deal similarly with \(x\), \(y\), and \(t\).</description>
</item>
<item>
<title>Positive Borel Measures</title>
<link>/2020/12/22/positive-borel-measures/</link>
<pubDate>Tue, 22 Dec 2020 00:00:00 +0000</pubDate>
<guid>/2020/12/22/positive-borel-measures/</guid>
<description>The support of a complex function \(f\) on a topological space \(X\) is the closure of the set \[\{x:f(x)\ne0\}\] The collection of all continuous complex functions on \(X\) whose support is compact is denoted by \(C_c(X)\), which is a vector space. The notation \[K\prec f\] means that \(K\) is a compact subset of \(X\), that \(f\in C_c(X),\;\;0\leq f(x)\leq 1\) for all \(x\in X\) and that \(f(x)=1\) for all \(x\in K\).</description>
</item>
<item>
<title>Measures</title>
<link>/2020/12/16/measures/</link>
<pubDate>Wed, 16 Dec 2020 00:00:00 +0000</pubDate>
<guid>/2020/12/16/measures/</guid>
<description>拓扑含空全交并,
开由开得即连续。
西代含空全补并,
开由可测即可测。
开由博雷即博雷,
可测博雷由可测。
A topology contains the empty set, the whole set, intersections and unions;
a function is continuous when the preimages of open sets are open.
A \(\sigma\)-algebra contains the empty set, the whole set, complements and unions;
a function is measurable when the preimages of open sets are measurable.
A Borel mapping is one whose preimages of open sets are Borel sets;
a measurable function has measurable preimages of Borel sets (members of the \(\sigma\)-algebra).
The empty set is \(\varnothing\). A collection \(\tau\) of subsets of a set \(X\) is said to be a topology in \(X\) if \(\tau\) has the following properties: (i) \(\varnothing\in\tau, X\in\tau\).</description>
</item>
<item>
<title>Lebesgue Theory</title>
<link>/2020/12/10/lebesgue-theory/</link>
<pubDate>Thu, 10 Dec 2020 00:00:00 +0000</pubDate>
<guid>/2020/12/10/lebesgue-theory/</guid>
<description>If \(A\) and \(B\) are two sets, we write \(A-B\) for the set of all elements \(x\) such that \(x\in A, x\notin B\). A family \(\mathscr R\) of sets is called a ring if \(A\in\mathscr R, B\in\mathscr R\) implies \[A\cup B\in\mathscr R,\quad A-B\in\mathscr R, \quad A\cap B=A-(A-B)\in\mathscr R\] A ring \(\mathscr R\) is called a \(\sigma\)-ring if \[\overset{\infty}{\underset{n=1}{\bigcup}}A_n\in\mathscr R\] whenever \(A_n\in\mathscr R(n=1,2,\cdots)\). And if \(\mathscr R\) is a \(\sigma\)-ring, \[\overset{\infty}{\underset{n=1}{\bigcap}}A_n=A_1-\overset{\infty}{\underset{n=1}{\bigcup}}(A_1-A_n)\in\mathscr R\]</description>
</item>
<item>
<title>Closed forms and exact forms</title>
<link>/2020/12/08/closed-forms-and-exact-forms/</link>
<pubDate>Tue, 08 Dec 2020 00:00:00 +0000</pubDate>
<guid>/2020/12/08/closed-forms-and-exact-forms/</guid>
<description>Let \(\omega\) be a \(k\)-form in an open set \(E\subset R^n\). If there is a \((k-1)\)-form \(\lambda\) in \(E\) such that \(\omega=d\lambda\), then \(\omega\) is said to be exact in \(E\). If \(\omega\) is of class \(\mathscr C&#39;\) and \(d\omega=0\), then \(\omega\) is said to be closed.
If \(\omega\) is of class \(\mathscr C&#39;&#39;\) in \(E\), then \[d^2\omega=0\] For a \(0\)-form \(f\in\mathscr C&#39;&#39;(E)\) \[\begin{align}d^2f&amp;=d\Biggl(\sum_{j=1}^{n}(D_jf)(\mathbf x)dx_j\Biggr)\\&amp;=\sum_{j=1}^{n}d(D_jf)(\mathbf x)dx_j\\&amp;=\sum_{i=1,j=1}^{n}(D_{ij}f)(\mathbf x)dx_i\land dx_j\\\end{align}\] Since \(D_{ij}f=D_{ji}f\) and \(dx_i\land dx_j=-dx_j\land dx_i\) so \[d^2\omega=(d^2f)\land dx_I=0\] Then every exact form of class \(\mathscr C&#39;\) is closed.</description>
</item>
<item>
<title>Stokes' theorem</title>
<link>/2020/12/07/stokes-theorem/</link>
<pubDate>Mon, 07 Dec 2020 00:00:00 +0000</pubDate>
<guid>/2020/12/07/stokes-theorem/</guid>
<description>If \(\Psi\) is a \(k\)-chain of class \(\mathscr C&#39;&#39;\) in an open set \(V\subset R^m\) and if \(\omega\) is a \((k-1)\)-form of class \(\mathscr C&#39;\) in \(V\), then \[\int_{\Psi}d\omega=\int_{\partial\Psi}\omega\]
The case \(k=m=1\) is the fundamental theorem of calculus: If \(f\in \mathscr R\) on \([a,b]\) and if there is a differentiable function \(F\) on \([a,b]\) such that \(F&#39;=f\), then \[\int_{a}^{b}f(x)dx=F(b)-F(a)\] Let \(\varepsilon&gt;0\), choose a partition \(P=\{x_0,\cdots,x_n\}\) of \([a,b]\) so that \[U(P,f)-L(P,f)&lt;\varepsilon\] and, by the mean value theorem, choose points \(t_i\in[x_{i-1},x_i]\) such that \[F(x_i)-F(x_{i-1})=f(t_i)\Delta x_i\] for \(i=1,\cdots,n\) Thus \[\sum_{i=1}^{n}f(t_i)\Delta x_i=F(b)-F(a)\] Then \[\Biggl|F(b)-F(a)-\int_{a}^{b}f(x)dx\Biggr|\leq \Biggl|F(b)-F(a)-\sum_{i=1}^{n}f(t_i)\Delta x_i\Biggr|+\Biggl|\sum_{i=1}^{n}f(t_i)\Delta x_i-\int_{a}^{b}f(x)dx\Biggr|&lt;0+\varepsilon=\varepsilon\] Then \[\int_{a}^{b}f(x)dx=F(b)-F(a)\]</description>
</item>
<item>
<title>Affine simplexes and chains</title>
<link>/2020/12/04/affine-simplexes-and-chains/</link>
<pubDate>Fri, 04 Dec 2020 00:00:00 +0000</pubDate>
<guid>/2020/12/04/affine-simplexes-and-chains/</guid>
<description>A mapping \(\mathbf f\) that carries a vector space \(X\) into a vector space \(Y\) is said to be affine if \[\mathbf f(\mathbf x)-\mathbf f(\mathbf 0)\] is linear, or in other words \[\mathbf f(\mathbf x)-\mathbf f(\mathbf 0)=A\mathbf x\quad A\in L(X,Y)\] The standard simplex \(Q^k\) is defined to be the set of all \(\mathbf u\in R^k\) of the form \[\mathbf u=\sum_{i=1}^{k}a_i\mathbf e_i\] where \(0\leq a_i, \sum a_i\leq 1, i=1,\cdots,k\).</description>
</item>
<item>
<title>Differential Forms</title>
<link>/2020/11/27/differential-forms/</link>
<pubDate>Fri, 27 Nov 2020 00:00:00 +0000</pubDate>
<guid>/2020/11/27/differential-forms/</guid>
<description>Suppose \(I^k\) is a k-cell in \(R^k\) consisting of all \[\mathbf x=(x_1,\cdots,x_k)\] such that \(a_i\leq x_i\leq b_i\quad(i=1,\cdots,k)\) and \(f\) is a real continuous function on \(I^k\). Put \(f=f_k\) and define \[f_{k-1}(x_1,\cdots,x_{k-1})=\int_{a_k}^{b_k}f_{k}(x_1,\cdots,x_{k-1},x_k)dx_k\] We repeat this process; after \(k\) steps we obtain a number, which we define as \[L(f)=\int_{I^k}f(\mathbf x)d\mathbf x\] or \[\int_{I^k}f\] If \(L&#39;(f)\) is the result obtained by carrying out the \(k\) integrations in some other order, then for every \(f\in\mathscr C(I^k), L(f)=L&#39;(f)\).</description>
</item>
<item>
<title>Functions of several variables</title>
<link>/2020/11/23/functions-of-several-variables/</link>
<pubDate>Mon, 23 Nov 2020 00:00:00 +0000</pubDate>
<guid>/2020/11/23/functions-of-several-variables/</guid>
<description>Let \(L(X,Y)\) be the set of all linear transformations of the vector space \(X\) into the vector space \(Y\). For \(A\in L(R^n,R^m)\), define the norm \(\lVert A\rVert\) of \(A\) to be the sup of all numbers \(\lvert A\mathbf x\rvert\), where \(\mathbf x\) ranges over all vectors in \(R^n\) with \(\lvert \mathbf x\rvert\leq1\). The inequality \[\lvert A\mathbf x\rvert\leq\lVert A\rVert\lvert \mathbf x\rvert\] holds for all \(\mathbf x\in R^n\). If \(\lambda\) is such that \[\lvert A\mathbf x\rvert\leq\lambda\lvert \mathbf x\rvert\] for all \(\mathbf x\in R^n\) then \(\lVert A\rVert\leq\lambda\).</description>
</item>
<item>
<title>Fourier Series</title>
<link>/2020/11/21/fourier-series/</link>
<pubDate>Sat, 21 Nov 2020 00:00:00 +0000</pubDate>
<guid>/2020/11/21/fourier-series/</guid>
<description>\[\cos(x)=\frac{1}{2}(e^{ix}+e^{-ix})\]\[\sin(x)=\frac{1}{2i}(e^{ix}-e^{-ix})\]\[e^{ix}=\cos(x)+i\sin(x)\]\[|e^{ix}|^2=e^{ix}\overline{e^{ix}}=e^{ix}e^{-ix}=1\] then \(|e^{inx}|=1\). Let \(x_0\) be the smallest positive number such that \(\cos(x_0)=0\); we define the number \(\pi\) by \(\pi=2x_0\). Then \(\cos(\pi/2)=0\). Because \(|\cos(\pi/2)+i\sin(\pi/2)|=\sqrt{\cos^2(\pi/2)+\sin^2(\pi/2)}=1\) and \(\cos(\pi/2)=0\), we get \(\sin^2(\pi/2)=1\). Since \(\sin&#39;(x)=\cos(x)&gt;0\) in \((0,\pi/2)\), \(\sin(x)\) is increasing in \((0,\pi/2)\), hence \(\sin(\pi/2)=1\). Then \[e^{\frac{\pi}{2}i}=\cos(\pi/2)+i\sin(\pi/2)=i\] \[e^{\pi i}=\cos(\pi)+i\sin(\pi)=-1\]\[e^{-\pi i}=\cos(-\pi)+i\sin(-\pi)=-1\]\[e^{2\pi i}=\cos(2\pi)+i\sin(2\pi)=1\]\[e^{z+2\pi i}=e^{z}e^{2\pi i}=e^z\quad(\text{z complex})\] Then \(e^{z}\) is periodic, with period \(2\pi i\).\[\int_{-\pi}^{\pi} e^{inx}dx=\frac{e^{inx}}{in}\Biggl|_{-\pi}^{\pi}=\begin{cases}2\pi &amp; (\text{if } n=0) \\0 &amp; (\text{if } n=\pm1,\pm2,\cdots)\end{cases}\]</description>
</item>
<item>
<title>Exponential and Logarithmic function</title>
<link>/2020/11/20/exponential-and-logarithmic-function/</link>
<pubDate>Fri, 20 Nov 2020 00:00:00 +0000</pubDate>
<guid>/2020/11/20/exponential-and-logarithmic-function/</guid>
<description>\[e=\sum_{n=0}^{\infty}\frac{1}{n!}\quad(n=0,1,2,3,\cdots)\]
\[\begin{align}(1+\frac{1}{n})^n&amp;=\sum_{m=0}^{n}{n \choose m}\cdot 1^{m}\cdot(\frac{1}{n})^{n-m}\\&amp;=\sum_{m=0}^{n}{n \choose m}\cdot(\frac{1}{n})^{n-m}\\&amp;={n \choose n}+{n \choose n-1}(\frac{1}{n})+{n \choose n-2}(\frac{1}{n})^2+\cdots+{n \choose 0}(\frac{1}{n})^n\\&amp;=1+n(\frac{1}{n})+\frac{n(n-1)}{2!}(\frac{1}{n})^2+\cdots+\frac{n!}{n!}(\frac{1}{n})^n\\&amp;=1+1+\frac{1}{2!}(1-\frac{1}{n})+\cdots+\frac{1}{n!}(1-\frac{1}{n})(1-\frac{2}{n})\cdots(1-\frac{n-1}{n})\\&amp;\leq \sum_{k=0}^{n}\frac{1}{k!}\leq e\end{align}\] Next if \(n\ge m\), \[(1+\frac{1}{n})^n=1+1+\frac{1}{2!}(1-\frac{1}{n})+\cdots+\frac{1}{n!}(1-\frac{1}{n})(1-\frac{2}{n})\cdots(1-\frac{n-1}{n})\\\ge 1+1+\frac{1}{2!}(1-\frac{1}{n})+\cdots+\frac{1}{m!}(1-\frac{1}{n})(1-\frac{2}{n})\cdots(1-\frac{m-1}{n})\\\ge 1+1+\frac{1}{2!}+\cdots+\frac{1}{m!}\\=\sum_{k=0}^{m}\frac{1}{k!}\] let \(n\to\infty\), keep \(m\) fixed, we get \[\lim_{n\to\infty}\text{inf}(1+\frac{1}{n})^n\ge \sum_{k=0}^{m}\frac{1}{k!}\] when \(m\to\infty\) \[\lim_{n\to\infty}\text{inf}(1+\frac{1}{n})^n\ge e\] Then \[\lim_{n\to\infty}(1+\frac{1}{n})^n=e\]
For fixed rational number \(z\), \[\begin{align}\lim_{n\to\infty}(1+\frac{z}{n})^n&amp;=\Bigl[\lim_{n\to\infty}(1+\frac{1}{n/z})^{n/z}\Bigr]^{z}\\&amp;=e^z\end{align}\]
The Ratio Test for series convergence: The series \(\sum a_n\) converges if \[\lim_{n\to\infty}\text{sup}\Biggl|\frac{a_{n+1}}{a_n}\Biggr|&lt;1\] and diverges if \[\Biggl|\frac{a_{n+1}}{a_n}\Biggr|\ge1\] for all \(n\ge n_0\), where \(n_0\) is some fixed integer.</description>
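A quick numerical sanity check in R (the values of n below are arbitrary choices, not from the post): the gap between \((1+\frac{1}{n})^n\) and \(e\) shrinks as \(n\) grows.
# check that (1 + 1/n)^n approaches e = exp(1) as n grows
n &lt;- 10^(1:6)
cbind(n, value = (1 + 1/n)^n, error = abs((1 + 1/n)^n - exp(1)))
# the error column decreases toward 0, consistent with the limit above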
</item>
<item>
<title>Power series</title>
<link>/2020/11/18/power-series/</link>
<pubDate>Wed, 18 Nov 2020 00:00:00 +0000</pubDate>
<guid>/2020/11/18/power-series/</guid>
<description>The power series are \[\sum_{n=0}^{\infty}c_nx^n\] The numbers \(c_n\) are called coefficients. \[R=\frac{1}{\displaystyle\lim_{n\to \infty}\text{sup}\sqrt[n]{|c_n|}}\] is called the radius of convergence of power series \[\sum_{n=0}^{\infty}c_nx^n\] \[\displaystyle\lim_{n\to \infty}\text{sup}\sqrt[n]{|c_nx^n|}=\frac{|x|}{R}\] Then \[\sum_{n=0}^{\infty}c_nx^n\] converges if \(|x|&lt;R\) and diverges if \(|x|&gt;R\).
Suppose the series \(\sum_{n=0}^{\infty}c_nx^n\) converges for \(|x|&lt;R\); then it converges uniformly on \([-R+\varepsilon,R-\varepsilon]\) for every \(\varepsilon&gt;0\). For \(|x|\leq R-\varepsilon\) we have \[|c_nx^n|\leq|c_n(R-\varepsilon)^n|\] and since \[\sum c_n(R-\varepsilon)^n\] converges absolutely (every power series converges absolutely in the interior of its interval of convergence), there is an integer \(N\) such that \[|\sum_{i=0}^{n}c_ix^i-\sum_{i=0}^{m}c_ix^i|=|\sum_{m+1}^{n}c_ix^i|&lt;\varepsilon,\quad n\ge m\ge N\] hence \(\sum_{n=0}^{\infty}c_nx^n\) converges uniformly.</description>
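A small numerical illustration in R (the coefficients \(c_n=2^n\) are an assumed example, not from the post), for which the root formula gives radius \(R=1/2\):
# estimate R = 1 / limsup |c_n|^(1/n) for the assumed coefficients c_n = 2^n
n &lt;- 1:50
cn &lt;- 2^n
1 / max(abs(cn)^(1/n))     # here exactly the radius 1/2, since |c_n|^(1/n) = 2 for every n
sum(cn * 0.25^n)           # a partial sum at x = 0.25 &lt; R; the series converges (to about 1)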
</item>
<item>
<title>Sequences and Series of functions</title>
<link>/2020/11/18/sequences-and-series-of-functions/</link>
<pubDate>Wed, 18 Nov 2020 00:00:00 +0000</pubDate>
<guid>/2020/11/18/sequences-and-series-of-functions/</guid>
<description>Suppose \(\{f_n\}, n=1,2,3,\cdots,\) is a sequence of functions defined on \(E\), and suppose the sequence of numbers \(\{f_n(x)\}\) converges for every \(x\in E\). We define the function \(f\) by \[f(x)=\lim_{n\to \infty}f_n(x)\quad (x\in E)\] We say that \(\{f_n\}\) converges to \(f\) pointwise on \(E\), and \(f\) is the limit function.
A sequence of functions \(\{f_n\}, n=1,2,3,\cdots,\) converges uniformly on \(E\) to a function \(f\) if for every \(\epsilon&gt;0\) there is an integer \(N\) such that \(n&gt;N\) implies \[|f_n(x)-f(x)|\le\epsilon\] for all \(x\in E\).</description>
</item>
<item>
<title>L'Hospital's rule and Taylor's theorem</title>
<link>/2020/11/15/l-hospital-s-rule-and-taylor-s-theorem/</link>
<pubDate>Sun, 15 Nov 2020 00:00:00 +0000</pubDate>
<guid>/2020/11/15/l-hospital-s-rule-and-taylor-s-theorem/</guid>
<description>Let \(f\) be defined on \([a,b]\); if for \(x\in[a,b]\) the limit \[f&#39;(x)=\underset{t\to x}{\lim}\frac{f(t)-f(x)}{t-x}\] exists, we say that \(f\) is differentiable at \(x\). If \(f&#39;\) is defined at every point of a set \(E\subset[a,b]\), we say that \(f\) is differentiable on \(E\).
The mean value theorem: If \(f\) and \(g\) are continuous real functions on \([a,b]\) which are differentiable in \((a,b)\), then there is a point \(x\in(a,b)\) at which \[\frac{f(b)-f(a)}{g(b)-g(a)}=\frac{f&#39;(x)}{g&#39;(x)}\] Put \(h(t)=[f(b)-f(a)]g(t)-[g(b)-g(a)]f(t)\quad(a\le t\le b)\) then \(h\) is continuous on \([a,b]\) and differentiable in \((a,b)\), and \(h(a)=[f(b)-f(a)]g(a)-[g(b)-g(a)]f(a)=f(b)g(a)-g(b)f(a)=h(b)\). To prove this theorem, we have to show that \(h&#39;(x)=0\) for some \(x\in(a,b)\).</description>
</item>
<item>
<title>Riemann-Stieltjes integral</title>
<link>/2020/11/15/riemann-stieltjes-integral/</link>
<pubDate>Sun, 15 Nov 2020 00:00:00 +0000</pubDate>
<guid>/2020/11/15/riemann-stieltjes-integral/</guid>
<description>A partition \(P\) of the interval \([a,b]\) is a finite set of points \(x_0,x_1,\cdots,x_n\), where \(a=x_0\le x_1\le \cdots\le x_n=b\) and \(\Delta x_i=x_i-x_{i-1}\quad(i=1,\cdots,n)\). Let \[M_i=\text{sup }f(x)\quad(x_{i-1}\le x\le x_i)\] \[m_i=\text{inf }f(x)\quad(x_{i-1}\le x\le x_i)\] Corresponding to each partition \(P\) of \([a,b]\), we put \[U(P,f)=\sum_{i=1}^{n}M_i\Delta x_i\] \[L(P,f)=\sum_{i=1}^{n}m_i\Delta x_i\] \[\overline{\int}_{a}^{b} f dx=\text{inf}\sum_{i=1}^{n}M_i\Delta x_i\] \[\underline{\int}_{a}^{b} f dx=\text{sup}\sum_{i=1}^{n}m_i\Delta x_i\] (the inf and sup being taken over all partitions \(P\)), which are called the upper and lower Riemann integrals of \(f\) over \([a,b]\), respectively. If the upper and lower integrals are equal \[\overline{\int}_{a}^{b} f dx=\underline{\int}_{a}^{b} f dx\] we say that \(f\) is Riemann-integrable on \([a,b]\), and we write \(f\in\mathscr R\).</description>
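A small R illustration (the function \(f(x)=x^2\) on \([0,1]\) and the partition sizes are assumed examples): refining the partition squeezes \(U(P,f)\) and \(L(P,f)\) together toward the common value \(1/3\).
# upper and lower sums for the assumed example f(x) = x^2 on [0, 1]
f &lt;- function(x) x^2
riemann &lt;- function(n) {
  x &lt;- seq(0, 1, length.out = n + 1)      # partition points x_0, ..., x_n
  dx &lt;- diff(x)
  M &lt;- pmax(f(x[-1]), f(x[-(n + 1)]))     # sup of f on each subinterval (f is monotone here)
  m &lt;- pmin(f(x[-1]), f(x[-(n + 1)]))     # inf of f on each subinterval
  c(U = sum(M * dx), L = sum(m * dx))
}
riemann(10); riemann(1000)                 # both approach 1/3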
</item>
<item>
<title>Continuity</title>
<link>/2020/11/12/continuity/</link>
<pubDate>Thu, 12 Nov 2020 00:00:00 +0000</pubDate>
<guid>/2020/11/12/continuity/</guid>
<description>If a set \(E\) in \(R^k\) is closed and bounded, then \(E\subset I\) for some compact k-cell \(I\); hence \(E\) is a closed subset of the compact set \(I\), and \(E\) is also compact. A bounded infinite set \(E\) in \(R^k\) is a subset of a compact k-cell \(I\), so \(E\) must have a limit point in \(I\), and hence in \(R^k\). If \(E\) is compact, then every infinite subset \(K\) of \(E\) has a limit point in \(E\), which in turn implies that \(E\) is closed and bounded.</description>
</item>
<item>
<title>Numerical Sequences and Series</title>
<link>/2020/11/10/numerical-sequences-and-series/</link>
<pubDate>Tue, 10 Nov 2020 00:00:00 +0000</pubDate>
<guid>/2020/11/10/numerical-sequences-and-series/</guid>
<description>A sequence \(\{p_n\}\) in a metric space \(X\) is said to converge if there is a point \(p\in X\) with the following property: For every \(\epsilon&gt;0\) there is an integer \(N\) such that \(n\ge N\) implies that \(d(p_n,p)&lt;\epsilon\). We also say \(\{p_n\}\) converges to \(p\) or \(p\) is the limit of \(\{p_n\}\) and we write \[\lim_{n\to \infty}p_n=p\] or \(p_n\to p\). A sequence \(\{p_n\}\) in a metric space \(X\) is said to be a Cauchy sequence if for every \(\epsilon&gt;0\), there is an integer \(N\) such that \(d(p_n,p_m)&lt;\epsilon\) if \(n\ge N\) and \(m\ge N\).</description>
</item>
<item>
<title>Set theory</title>
<link>/2020/11/06/set-theory/</link>
<pubDate>Fri, 06 Nov 2020 00:00:00 +0000</pubDate>
<guid>/2020/11/06/set-theory/</guid>
<description>In a metric space, a neighborhood of \(p\) is a set \(N_r(p)\) consisting of all \(q\) such that \(d(p,q)&lt;r\), for some \(r&gt;0\). The number \(r\) is called the radius of \(N_r(p)\). A point \(p\) is called a limit point of the set \(E\), if every neighborhood of \(p\) contains a point \(q\ne p\) with \(q\in E\). \(E\) is called closed if every limit point of \(E\) is a point of \(E\).</description>
</item>
<item>
<title>Convex sets</title>
<link>/2020/10/31/convex-sets/</link>
<pubDate>Sat, 31 Oct 2020 00:00:00 +0000</pubDate>
<guid>/2020/10/31/convex-sets/</guid>
<description>A set \(C\subseteq\mathbf R^n\) is an affine set if for any two distinct points \(x_1,x_2\in C\) and any \(\theta\in \mathbf R\), the linear combination of these two points lies in \(C\), \(\theta x_1+(1-\theta)x_2\in C\), with the coefficients summing to one. This kind of linear combination is called an affine combination. The set of all affine combinations of points in a set \(C\subseteq\mathbf R^n\) is called the affine hull of \(C\), and is denoted \[\mathbf{\text{aff }}C=\{\theta_1x_1+\cdots+\theta_kx_k|x_1,\cdots,x_k\in C,\theta_1+\cdots+\theta_k=1\}\] The affine hull is the smallest affine set that contains \(C\).</description>
</item>
<item>
<title>Clustering</title>
<link>/2020/10/29/clustering/</link>
<pubDate>Thu, 29 Oct 2020 00:00:00 +0000</pubDate>
<guid>/2020/10/29/clustering/</guid>
<description>Hierarchical Clustering Methods; Nonhierarchical Clustering Methods; Correspondence Analysis
Matrix \(\mathbf X\), with elements \(x_{ij}\), is an \(I\times J\) two-way contingency table of unscaled frequencies or counts, \(i=1,2,\cdots,I;j=1,2,\cdots,J\), with grand total \(n\). The matrix of proportions \(\mathbf P=\{p_{ij}\}\) with elements \(p_{ij}=\frac{1}{n}x_{ij}\), is called the correspondence matrix. The row sums are the vector \[\mathbf r=\{r_{i}=\sum_{j=1}^{J}p_{ij}=\sum_{j=1}^{J}\frac{1}{n}x_{ij}\}\] or \[\underset{(I\times 1)}{\mathbf r}=\underset{(I\times J)}{\mathbf P}\underset{(J\times1)}{\mathbf 1_J}\] The column sums are the vector \[\mathbf c=\{c_{j}=\sum_{i=1}^{I}p_{ij}=\sum_{i=1}^{I}\frac{1}{n}x_{ij}\}\] or \[\underset{(J\times 1)}{\mathbf c}=\underset{(J\times I)}{\mathbf P^T}\underset{(I\times1)}{\mathbf 1_I}\] Let the diagonal matrices be \[\mathbf D_r=diag(r_1,r_2,\cdots,r_I)\] \[\mathbf D_c=diag(c_1,c_2,\cdots,c_J)\] Correspondence analysis can be formulated as the weighted least squares problem to select matrix \(\hat{\mathbf P}=\{\hat{p}_{ij}\}\), which has a specified reduced rank and minimizes the sum of squares \[\sum_{i=1}^{I}\sum_{j=1}^{J}\frac{(p_{ij}-\hat{p}_{ij})^2}{r_ic_j}=tr\Bigl[(\mathbf D_r^{-1/2}(\mathbf P-\hat{\mathbf P})\mathbf D_c^{-1/2})(\mathbf D_r^{-1/2}(\mathbf P-\hat{\mathbf P})\mathbf D_c^{-1/2})^T\Bigr]\] since \((p_{ij}-\hat{p}_{ij})/\sqrt{r_ic_j}\) is the \((i,j)\) element of \(\mathbf D_r^{-1/2}(\mathbf P-\hat{\mathbf P})\mathbf D_c^{-1/2}\). The scaled version of the correspondence matrix \(\mathbf P=\{p_{ij}\}\) is \[\mathbf B=\mathbf D_r^{-1/2}\mathbf P\mathbf D_c^{-1/2}\] The best low \(\text{rank}=s\) approximation \(\hat{\mathbf B}\) to \(\mathbf B\) is given by the first \(s\) terms in the singular-value decomposition \[\mathbf D_r^{-1/2}\mathbf P\mathbf D_c^{-1/2}=\sum_{k=1}^{J}\widetilde{\lambda}_k\widetilde{\mathbf u}_k\widetilde{\mathbf v}_k^T\] where \[\mathbf D_r^{-1/2}\mathbf P\mathbf D_c^{-1/2}\widetilde{\mathbf v}_k=\widetilde{\lambda}_k\widetilde{\mathbf u}_k\] and \[\widetilde{\mathbf u}_k^T\mathbf D_r^{-1/2}\mathbf P\mathbf D_c^{-1/2}=\widetilde{\lambda}_k\widetilde{\mathbf v}_k^T\] Then the approximation to \(\mathbf P\) is given by \[\hat{\mathbf P}=\mathbf D_r^{1/2}\hat{\mathbf B}\mathbf D_c^{1/2}\approx\sum_{k=1}^{s}\widetilde{\lambda}_k(\mathbf D_r^{1/2}\widetilde{\mathbf u}_k)(\mathbf D_c^{1/2}\widetilde{\mathbf v}_k)^T\] and the error of approximation is \[\sum_{k=s+1}^{J}\widetilde{\lambda}_k^2\] The term \(\mathbf r\mathbf c^T\) always provides the best rank one approximation to the correspondence matrix \(\mathbf P\); this corresponds to the assumption of independence of the rows and columns.</description>
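A minimal R sketch of these steps (the 2 by 3 table below is an assumed toy example): the scaled matrix \(\mathbf B\) is decomposed by SVD and its leading term reproduces \(\mathbf r\mathbf c^T\), the independence model.
X &lt;- matrix(c(10, 20, 30, 40, 50, 60), nrow = 2)     # assumed 2 x 3 contingency table
P &lt;- X / sum(X)                                       # correspondence matrix
r &lt;- rowSums(P); cs &lt;- colSums(P)                     # row and column sums
B &lt;- diag(1 / sqrt(r)) %*% P %*% diag(1 / sqrt(cs))   # D_r^(-1/2) P D_c^(-1/2)
s &lt;- svd(B)
s$d                                                   # singular values; the largest is 1
# the leading rank-one term reconstructs r c^T, the independence model
Phat1 &lt;- diag(sqrt(r)) %*% (s$d[1] * s$u[, 1] %*% t(s$v[, 1])) %*% diag(sqrt(cs))
all.equal(Phat1, r %*% t(cs))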
</item>
<item>
<title>Classification</title>
<link>/2020/10/22/classification/</link>
<pubDate>Thu, 22 Oct 2020 00:00:00 +0000</pubDate>
<guid>/2020/10/22/classification/</guid>
<description>Two classes \(\pi_1\) and \(\pi_2\) have prior probabilities \(p_1\) and \(p_2\), respectively, with \(p_1+p_2=1\). Under the two classes, the random variable \(\mathbf x\) follows the density functions \(f_1(\mathbf x)\) and \(f_2(\mathbf x)\) over the region \(R_1+R_2\), and \(\underset{R_1}{\int} f_1(\mathbf x)dx=P(1|1)\), \(\underset{R_2}{\int} f_1(\mathbf x)dx=P(2|1)\), \(\underset{R_2}{\int} f_2(\mathbf x)dx=P(2|2)\), \(\underset{R_1}{\int} f_2(\mathbf x)dx=P(1|2)\). Then the probability that an observation \(\mathbf x\) comes from class \(\pi_1\) and is correctly classified as \(\pi_1\) is \[P(\mathbf x\in R_1|\pi_1)P(\pi_1)=P(1|1)p_1\], and the probability that an observation \(\mathbf x\) is misclassified as \(\pi_1\) is \[P(\mathbf x\in R_1|\pi_2)P(\pi_2)=P(1|2)p_2\] Similarly, the probability that an observation is correctly classified as \(\pi_2\) is \[P(\mathbf x\in R_2|\pi_2)P(\pi_2)=P(2|2)p_2\], and the probability that an observation is misclassified as \(\pi_2\) is \[P(\mathbf x\in R_2|\pi_1)P(\pi_1)=P(2|1)p_1\] The costs of misclassification can be defined by a cost matrix \[\begin{array}{cc|cc}&amp;&amp;\text{Classify as:}\\&amp;&amp;\pi_1&amp;\pi_2\\\hline\\\text{True populations:}&amp;\pi_1&amp;0&amp;c(2|1)\\&amp;\pi_2&amp;c(1|2)&amp;0\\\end{array}\] Then the Expected Cost of Misclassification (ECM) is given by \[\begin{bmatrix}P(2|1)&amp;P(1|2)\\\end{bmatrix}\begin{bmatrix}0&amp;c(2|1)\\c(1|2)&amp;0\\\end{bmatrix}\begin{bmatrix}p_2\\p_1\\\end{bmatrix}=P(1|2)c(1|2)p_2+P(2|1)c(2|1)p_1\] A reasonable classification rule should have an ECM as small as possible.</description>
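A tiny R computation of the ECM from the matrix product above (every number below is an assumed illustration, not from the post):
p1 &lt;- 0.6; p2 &lt;- 0.4          # assumed prior probabilities
P21 &lt;- 0.10; P12 &lt;- 0.05      # assumed misclassification probabilities P(2|1), P(1|2)
c21 &lt;- 5; c12 &lt;- 10           # assumed misclassification costs c(2|1), c(1|2)
P12 * c12 * p2 + P21 * c21 * p1                                     # ECM written out
t(c(P21, P12)) %*% matrix(c(0, c12, c21, 0), 2, 2) %*% c(p2, p1)    # same value via the matrix product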
</item>
<item>
<title>Correlation Analysis</title>
<link>/2020/10/18/correlation-analysis/</link>
<pubDate>Sun, 18 Oct 2020 00:00:00 +0000</pubDate>
<guid>/2020/10/18/correlation-analysis/</guid>
<description>Canonical-correlation analysis (CCA), also called canonical variates analysis, is a way of inferring information from cross-covariance matrices. If we have two groups of variables \(\mathbf X\) has \(p\) variables \[\mathbf X=\begin{bmatrix}X_1\\X_2\\\vdots\\X_p\\\end{bmatrix}\] and \(\mathbf Y\) has \(q\) variables \[\mathbf Y=\begin{bmatrix}Y_1\\Y_2\\\vdots\\Y_q\\\end{bmatrix}\] \[E(\mathbf X)=\boldsymbol\mu_X\] \[Cov(\mathbf X)=\boldsymbol\Sigma_{XX}\] and \[E(\mathbf Y)=\boldsymbol\mu_Y\] \[Cov(\mathbf Y)=\boldsymbol\Sigma_{YY}\] and \[Cov(\mathbf X,\mathbf Y)=\boldsymbol\Sigma_{XY}=\boldsymbol\Sigma_{YX}^T=E(\mathbf X-\boldsymbol\mu_X)(\mathbf Y-\boldsymbol\mu_Y)^T=\begin{bmatrix}\sigma_{X_1Y_1}&amp;\sigma_{X_1Y_2}&amp;\cdots&amp;\sigma_{X_1Y_q}\\\sigma_{X_2Y_1}&amp;\sigma_{X_2Y_2}&amp;\cdots&amp;\sigma_{X_2Y_q}\\\vdots&amp;\vdots&amp;\ddots&amp;\vdots\\\sigma_{X_pY_1}&amp;\sigma_{X_pY_2}&amp;\cdots&amp;\sigma_{X_pY_q}\\\end{bmatrix}\] Linear combinations provide simple summary measures of a set of variables.</description>
</item>
<item>
<title>Factor analysis</title>
<link>/2020/10/11/factor-analysis/</link>
<pubDate>Sun, 11 Oct 2020 00:00:00 +0000</pubDate>
<guid>/2020/10/11/factor-analysis/</guid>
<description>Let \(\mathbf X\) be drawn from a \(p\)-variate normal \(N_p(\boldsymbol\mu, \boldsymbol\Sigma)\) distribution. The matrix of factor loadings is \[\mathbf L=\begin{bmatrix}\ell_{11}&amp;\ell_{12}&amp;\cdots&amp;\ell_{1m}\\\ell_{21}&amp;\ell_{22}&amp;\cdots&amp;\ell_{2m}\\\vdots&amp;\vdots&amp;\ddots&amp;\vdots\\\ell_{p1}&amp;\ell_{p2}&amp;\cdots&amp;\ell_{pm}\\\end{bmatrix}\] where \(\ell_{ij}\) is the loading of the \(i^{th}\) variable on the \(j^{th}\) factor.
The common factor is \[\mathbf F=\begin{bmatrix}F_1\\F_2\\\vdots\\F_m\\\end{bmatrix}\] with \(E(\mathbf F)=\underset{(m\times 1)}{\mathbf0}\), \(Var(F_j)=1,\quad (j=1,2,\cdots,m)\) and \(Cov(\mathbf F)=E(\mathbf F\mathbf F^T)=\underset{(m\times m)}{\mathbf I}\) Then the Orthogonal factor model is \[\underset{(p\times1)}{\mathbf X-\boldsymbol\mu}=\underset{(p\times m)}{\mathbf L}\underset{(m\times1)}{\mathbf F}+\underset{(p\times1)}{\boldsymbol\epsilon}\] with \(E(\boldsymbol\epsilon)=\underset{(p\times 1)}{\mathbf0}\) and \[Cov(\boldsymbol\epsilon)=E(\boldsymbol\epsilon\boldsymbol\epsilon^T)=\boldsymbol\Psi=\begin{bmatrix}\psi_1&amp;0&amp;\cdots&amp;0\\0&amp;\psi_2&amp;\cdots&amp;0\\\vdots&amp;\vdots&amp;\ddots&amp;\vdots\\0&amp;0&amp;\cdots&amp;\psi_p\\\end{bmatrix}\] with \(Var(\epsilon_i)=\psi_i\) and \(\mathbf F\) and \(\boldsymbol\epsilon\) are independent with \(Cov(\boldsymbol\epsilon,\mathbf F)=E(\boldsymbol\epsilon\mathbf F^T)=\underset{(p\times m)}{\mathbf0}\)Because \(\mathbf L\) is fixed, then \[\begin{align}\boldsymbol\Sigma=Cov(\mathbf X)&amp;=E(\mathbf X-\boldsymbol\mu)(\mathbf X-\boldsymbol\mu)^T\\&amp;=E(\mathbf L\mathbf F+\boldsymbol\epsilon)(\mathbf L\mathbf F+\boldsymbol\epsilon)^T\\&amp;=E(\mathbf L\mathbf F+\boldsymbol\epsilon)((\mathbf L\mathbf F)^T+\boldsymbol\epsilon^T)\\&amp;=E\Bigl(\mathbf L\mathbf F(\mathbf L\mathbf F)^T+\boldsymbol\epsilon(\mathbf L\mathbf F)^T+\mathbf L\mathbf F\boldsymbol\epsilon^T+\boldsymbol\epsilon\boldsymbol\epsilon^T\Bigr)\\&amp;=\mathbf LE(\mathbf F\mathbf F^T)\mathbf L^T+\mathbf0+\mathbf0+E(\boldsymbol\epsilon\boldsymbol\epsilon^T)\\&amp;=\mathbf L\mathbf L^T+\boldsymbol\Psi\end{align}\] or \[Var(X_i)=\underset{Var(X_i)}{\underbrace{\sigma_{ii}}}=\mathbf L_i\mathbf L_i^T+\psi_i=\underset{\text{communality}}{\underbrace{\ell_{i1}^2+\ell_{i2}^2+\cdots+\ell_{im}^2}}+\underset{\text{specific variance}}{\underbrace{\psi_i}}\] with \(\mathbf L_i\) is the \(i^{th}\) row of \(\mathbf L\) We can denote the \(i^{th}\) communality as \(h_i^2=\ell_{i1}^2+\ell_{i2}^2+\cdots+\ell_{im}^2,\quad (i=1,2,\cdots,p)\), which is the sum of squares of the loadings of the \(i^{th}\) variable on the \(m\) common factors, and the total variance of the \(i^{th}\) variable is the sum of communality and specific variance \(\sigma_{ii}=h_i^2+\psi_i\)\[Cov(X_i,X_k)=E(\mathbf L_i^T\mathbf F+\epsilon_i)(\mathbf L_k^T\mathbf F+\epsilon_k)^T=\mathbf L_i^T\mathbf L_k=\ell_{i1}\ell_{k1}+\ell_{i2}\ell_{k2}+\cdots+\ell_{im}\ell_{km}\]\[Cov(\mathbf X,\mathbf F)=E(\mathbf X-\boldsymbol\mu)\mathbf F^T=E(\mathbf L\mathbf F+\boldsymbol\epsilon)\mathbf F^T=\mathbf LE(\mathbf F\mathbf F^T)+E(\boldsymbol\epsilon\mathbf F^T)=\mathbf L\] or \[Cov(X_i,F_j)=E(X_i-\mu_i)\mathbf F_j^T=E(\mathbf L_i^T\mathbf F+\epsilon_i)\mathbf F_j^T=\ell_{ij}\]</description>
</item>
<item>
<title>Principal Component Analysis</title>
<link>/2020/10/07/principal-component-analysis/</link>
<pubDate>Wed, 07 Oct 2020 00:00:00 +0000</pubDate>
<guid>/2020/10/07/principal-component-analysis/</guid>
<description>Let the random vector \(\mathbf X^T=[X_1,X_2,\cdots,X_p]\) have the covariance matrix \(\boldsymbol\Sigma\) with eigenvalues \(\lambda_1\ge\lambda_2\ge\cdots\ge\lambda_p\ge0\), the linear combinations \(Y_i=\mathbf a_i^T\mathbf X=a_{i1}X_1+a_{i2}X_2+\cdots+a_{ip}X_p, \quad (i=1,2,\cdots,p)\) has \(Var(Y_i)=Var(\mathbf a_i^T\mathbf X)=\mathbf a_i^TCov(\mathbf X)\mathbf a_i=\mathbf a_i^T\boldsymbol\Sigma\mathbf a_i\) and \(Cov(Y_i,Y_k)=Cov(\mathbf a_i^T\mathbf X, \mathbf a_k^T\mathbf X)=\mathbf a_i^T\boldsymbol\Sigma\mathbf a_k \quad i,k=1,2,\cdots,p\). The principal components are those uncorrelated linear combinations of \([X_1,X_2,\cdots,X_p]\), \(Y_1,Y_2,\cdots,Y_p\) whose variances \(Var(Y_i)=\mathbf a_i^T\boldsymbol\Sigma\mathbf a_i\) are as large as possible, subject to \(\mathbf a_i^T\mathbf a_i=1\). These linear combinations represent the selection of a new coordinate system obtained by rotating the original system with \(Y_1,Y_2,\cdots,Y_p\) as the new coordinate axes.</description>
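A short R check (the built-in iris measurements serve as an assumed example) that the principal component variances are the eigenvalues of the sample covariance matrix:
X &lt;- as.matrix(iris[, 1:4])      # assumed example data
S &lt;- cov(X)
e &lt;- eigen(S)
e$values                          # variances of the principal components, largest first
pc &lt;- prcomp(X)                   # the same decomposition via prcomp
all.equal(e$values, as.numeric(pc$sdev^2))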
</item>
<item>
<title>Comparisons of several means</title>
<link>/2020/09/29/comparisons-of-several-means/</link>
<pubDate>Tue, 29 Sep 2020 00:00:00 +0000</pubDate>
<guid>/2020/09/29/comparisons-of-several-means/</guid>
<description>Paired Comparisons:
If there are \(2\) treatments applied to \(p\)-variate observations, the difference between treatment \(1\) and treatment \(2\) for unit \(j\) is \(\mathbf d_j=\mathbf x_{j1}-\mathbf x_{j2},\quad j=1,2,\cdots,n\). If the \(\mathbf d_j\) are independent \(N_p(\boldsymbol\delta, \mathbf\Sigma_d)\) random vectors, inferences about the vector of mean differences \(\boldsymbol\delta\) can be based upon a \(T^2\)-statistic: \(T^2=n(\overline{\mathbf d}-\boldsymbol\delta)^T\mathbf S_d^{-1}(\overline{\mathbf d}-\boldsymbol\delta)\) is distributed as an \(\frac{(n-1)p}{n-p}F_{p,n-p}\) random variable, where \(\overline{\mathbf d}=\displaystyle\frac{1}{n}\displaystyle\sum_{j=1}^{n}\mathbf d_j\) and \(\mathbf S_d=\displaystyle\frac{1}{n-1}\displaystyle\sum_{j=1}^{n}(\mathbf d_j-\overline{\mathbf d})(\mathbf d_j-\overline{\mathbf d})^T\); then an \(\alpha\)-level hypothesis test of \(H_0:\boldsymbol\delta=\mathbf 0\) versus \(H_1:\boldsymbol\delta\ne\mathbf 0\) rejects \(H_0\) if the observed \(T^2=n\overline{\mathbf d}^T\mathbf S_d^{-1}\overline{\mathbf d}&gt;\frac{(n-1)p}{n-p}F_{p,n-p}(\alpha)\).</description>
</item>
<item>
<title>Inferences about the mean</title>
<link>/2020/09/25/inferences-about-the-mean/</link>
<pubDate>Fri, 25 Sep 2020 00:00:00 +0000</pubDate>
<guid>/2020/09/25/inferences-about-the-mean/</guid>
<description>The hypothesis testing about the mean is a test of the competing hypotheses: \(H_0:\mu=\mu_0\) and \(H_1:\mu\ne\mu_0\). If \(X_1,X_2,\cdots,X_n\) denote a random sample from a normal population, the appropriate test statistic is \(t=\frac{(\overline X-\mu_0)}{s/\sqrt{n}}\) with \(s^2=\frac{1}{(n-1)}\displaystyle\sum_{i=1}^{n}(X_i-\overline X)^2\). Rejecting \(H_0\) when \(|t|\) is large is equivalent to rejecting \(H_0\) when \(t^2=\frac{(\overline X-\mu_0)^2}{s^2/n}=n(\overline X-\mu_0)(s^2)^{-1}(\overline X-\mu_0)\) is large. Then the test becomes reject \(H_0\) in favor of \(H_1\) at significance level \(\alpha\) if \(n(\overline X-\mu_0)(s^2)^{-1}(\overline X-\mu_0)&gt;t_{n-1}^2(\alpha/2)\), its multivariate analog is \(T^2=(\overline {\mathbf X}-\boldsymbol\mu_0)^T(\frac{1}{n}\mathbf S)^{-1}(\overline {\mathbf X}-\boldsymbol\mu_0)=n(\overline {\mathbf X}-\boldsymbol\mu_0)^T\mathbf S^{-1}(\overline {\mathbf X}-\boldsymbol\mu_0)\), where \(\overline {\mathbf X}=\frac{1}{n}\displaystyle\sum_{j=1}^{n}\mathbf X_j\), \(\underset{(p\times p)}{\mathbf S}=\frac{1}{n-1}\displaystyle\sum_{j=1}^{n}(\underset{(p\times 1)}{\mathbf X_j}-\underset{(p\times 1)}{\overline {\mathbf X}})(\underset{(p\times 1)}{\mathbf X_j}-\underset{(p\times 1)}{\overline {\mathbf X}})^T\)</description>
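A minimal R sketch of the \(T^2\) statistic defined above (the data are simulated, so the sample size, dimension and seed are all assumptions):
set.seed(1)
n &lt;- 30; p &lt;- 3
X &lt;- matrix(rnorm(n * p), n, p)               # simulated sample with true mean 0
mu0 &lt;- rep(0, p)                              # hypothesized mean vector
xbar &lt;- colMeans(X)
S &lt;- cov(X)
T2 &lt;- drop(n * t(xbar - mu0) %*% solve(S) %*% (xbar - mu0))
crit &lt;- (n - 1) * p / (n - p) * qf(0.95, p, n - p)   # comparison value at alpha = 0.05
c(T2 = T2, critical = crit)                   # reject H0 if T2 exceeds the critical value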
</item>
<item>
<title>The Multivariate Normal Density</title>
<link>/2020/09/11/the-multivariate-normal-density/</link>
<pubDate>Fri, 11 Sep 2020 00:00:00 +0000</pubDate>
<guid>/2020/09/11/the-multivariate-normal-density/</guid>
<description>The univariate normal pdf is:\[f_X(x)=\frac{1}{\sqrt{2\pi}\sigma}e^{-\frac{1}{2}(\frac{x-\mu}{\sigma})^2}, \quad -\infty&lt;x&lt;+\infty\] The term \((\frac{x-\mu}{\sigma})^2=(x-\mu)(\sigma^2)^{-1}(x-\mu)\) measures the square of the univariate distance from \(x\) to \(\mu\) in standard deviation units. This can be generalized to a \(p\times 1\) vector \(\mathbf x\) of observations on several variables as \((\mathbf X-\boldsymbol \mu)^T(\mathbf \Sigma)^{-1}(\mathbf X-\boldsymbol \mu)\), which is the square of the multivariate generalized distance from \(\mathbf X\) to \(\boldsymbol \mu\); the \(p\times p\) matrix \(\mathbf \Sigma\) is the variance–covariance matrix of \(\mathbf X\).</description>
</item>
<item>
<title>The Bivariate Normal Distribution</title>
<link>/2020/09/10/the-bivariate-normal-distribution/</link>
<pubDate>Thu, 10 Sep 2020 00:00:00 +0000</pubDate>
<guid>/2020/09/10/the-bivariate-normal-distribution/</guid>
<description>The univariate normal pdf is:\[f_Y(y)=\frac{1}{\sqrt{2\pi}\sigma}e^{-\frac{1}{2}(\frac{y-\mu}{\sigma})^2}, \quad -\infty&lt;y&lt;+\infty\]
The bivariate normal pdf is \[f_{X,Y}(x, y)=Ke^{-\frac{1}{2}c(x^2-2\nu xy+y^2)}, \quad -\infty&lt;x, y&lt;+\infty\] where \(c\) and \(\nu\) are constants.\[\begin{align}f_{X,Y}(x, y)&amp;=Ke^{-\frac{1}{2}c(x^2-2\nu xy+y^2)}\\&amp;=Ke^{-\frac{1}{2}c(x^2-\nu^2x^2+\nu^2x^2-2\nu xy+y^2)}\\&amp;=Ke^{-\frac{1}{2}c\bigl[(x^2-\nu^2x^2)+(\nu x-y)^2\bigr]}\\&amp;=Ke^{-\frac{1}{2}cx^2(1-\nu^2)}e^{-\frac{1}{2}c(\nu x-y)^2}\\\end{align}\] The exponents must be negative, so \(1-\nu^2&gt;0\).
\[\begin{align}\int_{-\infty}^{+\infty}\int_{-\infty}^{+\infty}f_{X,Y}(x, y)dxdy&amp;=\int_{-\infty}^{+\infty}\int_{-\infty}^{+\infty}Ke^{-\frac{1}{2}cx^2(1-\nu^2)}e^{-\frac{1}{2}c(\nu x-y)^2}dxdy\\&amp;=K\int_{-\infty}^{+\infty}e^{-\frac{1}{2}cx^2(1-\nu^2)} \Biggl[\int_{-\infty}^{+\infty}e^{-\frac{1}{2}c(y-\nu x)^2}dy\Biggr]dx\\&amp;=K\int_{-\infty}^{+\infty}e^{-\frac{1}{2}cx^2(1-\nu^2)}\frac{\sqrt{2\pi}}{\sqrt{c}}dx\\&amp;=K\frac{\sqrt{2\pi}}{\sqrt{c}}\frac{\sqrt{2\pi}}{\sqrt{c(1-\nu^2)}}\\&amp;=K\frac{2\pi}{c\sqrt{1-\nu^2}}\\&amp;=1\end{align}\] Then \(K=\frac{c\sqrt{1-\nu^2}}{2\pi}\), if we choose \(c=\frac{1}{1-\nu^2}\), then \(K=\frac{1}{2\pi\sqrt{1-\nu^2}}\) and\[\begin{align}f_{X,Y}(x, y)&amp;=\frac{1}{2\pi\sqrt{1-\nu^2}}e^{-\frac{1}{2}\frac{1}{1-\nu^2}(x^2-2\nu xy+y^2)}\\&amp;=\frac{1}{2\pi\sqrt{1-\nu^2}}e^{-\frac{1}{2}\frac{1}{1-\nu^2}(x^2-\nu^2x^2+\nu^2x^2-2\nu xy+y^2)}\\&amp;=\frac{1}{2\pi\sqrt{1-\nu^2}}e^{-\frac{1}{2}x^2}e^{-\frac{1}{2}\frac{1}{1-\nu^2}(\nu x-y)^2}\end{align}\] The marginal pdfs are sure the standard normal:\[\begin{align}f_{X}(x)&amp;=\int_{-\infty}^{+\infty}f_{X,Y}(x, y)dy\\&amp;=\int_{-\infty}^{+\infty}\frac{1}{2\pi\sqrt{1-\nu^2}}e^{-\frac{1}{2}x^2}e^{-\frac{1}{2}\frac{1}{1-\nu^2}(\nu x-y)^2}dy\\&amp;=\frac{1}{2\pi\sqrt{1-\nu^2}}e^{-\frac{1}{2}x^2}\int_{-\infty}^{+\infty}e^{-\frac{1}{2}\frac{1}{1-\nu^2}(\nu x-y)^2}dy\\&amp;=\frac{1}{2\pi\sqrt{1-\nu^2}}e^{-\frac{1}{2}x^2}\sqrt{2\pi}\sqrt{1-\nu^2}\\&amp;=\frac{1}{\sqrt{2\pi}}e^{-\frac{1}{2}x^2}\end{align}\] and \(E(X)=E(Y)=0\) and \(\sigma_X=\sigma_Y=1\), then the correlation coefficient between X and Y is:\[\begin{align}\rho(X,Y)&amp;=\frac{Cov(X,Y)}{\sigma_X\sigma_Y}\\&amp;=\frac{E(XY) − E(X)E(Y)}{\sigma_X\sigma_Y}\\&amp;=E(XY)\\&amp;=\int_{-\infty}^{+\infty}\int_{-\infty}^{+\infty}xyf_{X,Y}(x, y)dxdy\\&amp;=\int_{-\infty}^{+\infty}\int_{-\infty}^{+\infty}xy\frac{1}{2\pi\sqrt{1-\nu^2}}e^{-\frac{1}{2}x^2}e^{-\frac{1}{2}\frac{1}{1-\nu^2}(\nu x-y)^2}dxdy\\&amp;=\int_{-\infty}^{+\infty}\frac{1}{\sqrt{2\pi}}xe^{-\frac{1}{2}x^2} \Biggl[\int_{-\infty}^{+\infty}\frac{1}{\sqrt{2\pi(1-\nu^2)}}ye^{-\frac{1}{2}\frac{1}{1-\nu^2}(\nu x-y)^2}dy\Biggr]dx\\&amp;=\int_{-\infty}^{+\infty}\frac{1}{\sqrt{2\pi}}xe^{-\frac{1}{2}x^2}\nu x dx\\&amp;=\nu\int_{-\infty}^{+\infty}\frac{1}{\sqrt{2\pi}}x^2e^{-\frac{1}{2}x^2}dx\\&amp;=\nu Var(X)\\&amp;=\nu \sigma_X\\&amp;=\nu\end{align}\] So \(\nu\) is the correlation coefficient between X and Y.</description>
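The factored form of the density above says that, given \(X=x\), \(Y\) is normal with mean \(\nu x\) and variance \(1-\nu^2\); a quick R simulation (the value of \(\nu\), the sample size and the seed are assumptions) recovers \(\nu\) as the correlation:
set.seed(42)
nu &lt;- 0.7                                           # assumed value of the parameter
n &lt;- 1e5
x &lt;- rnorm(n)                                       # X ~ N(0, 1)
y &lt;- rnorm(n, mean = nu * x, sd = sqrt(1 - nu^2))   # Y | X = x from the factored pdf
cor(x, y)                                           # close to nu = 0.7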
</item>
<item>
<title>Randomized block design</title>
<link>/2020/09/09/randomized-block-design/</link>
<pubDate>Wed, 09 Sep 2020 00:00:00 +0000</pubDate>
<guid>/2020/09/09/randomized-block-design/</guid>
<description>In the Randomized block design, all of the sample sizes are the same, \(b\), the number of blocks; the mathematical model associated with \(Y_{ij}\) is \(Y_{ij}=\mu_j+\beta_i+\epsilon_{ij}\), where the term \(\beta_i\) represents the effect of the \(i^{th}\) block.\[\begin{array}{|cc|cccc ccc|}\hline&amp;&amp;&amp;\text{treatment}&amp;\text{levels}&amp; &amp; &amp; Block&amp;Block&amp; Block\\&amp; &amp; 1 &amp; 2 &amp; \cdots &amp; k &amp;&amp; Totals &amp; Means &amp; Effects \\\hline&amp;1&amp; Y_{11} &amp; Y_{12} &amp; \cdots &amp; Y_{1k} &amp;&amp; T_{1.</description>
</item>
<item>
<title>Testing Subhypotheses with Contrasts</title>
<link>/2020/09/07/testing-subhypotheses-with-contrasts/</link>
<pubDate>Mon, 07 Sep 2020 00:00:00 +0000</pubDate>
<guid>/2020/09/07/testing-subhypotheses-with-contrasts/</guid>
<description>A linear combination \(C=\displaystyle\sum_{j=1}^{k}c_j\mu_j\) of the true means \(\mu_1,\mu_2,\cdots,\mu_k\) of the \(k\) factor levels of the randomized one-factor design is said to be a contrast if the sum of its coefficients \(\displaystyle\sum_{j=1}^{k}c_j=0\). Because \(\overline Y_{.j}\) is always an unbiased estimator for \(\mu_j\), we can use it to estimate \(C\): \(\hat C=\displaystyle\sum_{j=1}^{k}c_j\overline Y_{.j}\). Because the \(Y_{ij}\) are normal, \(\hat C\) is also normal. Then, \(E(\hat C)=\displaystyle\sum_{j=1}^{k}c_jE(\overline Y_{.j})=\displaystyle\sum_{j=1}^{k}c_j\mu_j=C\) and \(Var(\hat C)=\displaystyle\sum_{j=1}^{k}c_j^2Var(\overline Y_{.j})=\displaystyle\sum_{j=1}^{k}c_j^2\frac{\sigma^2}{n_j}=\sigma^2\displaystyle\sum_{j=1}^{k}\frac{c_j^2}{n_j}\). Replacing \(\sigma^2\) by its estimate \(MSE\) gives a formula for the estimated variance \(S_{\hat C}^2=MSE\displaystyle\sum_{j=1}^{k}\frac{c_j^2}{n_j}\).</description>
</item>
<item>
<title>Randomized one-factor design and the analysis of variance (ANOVA)</title>
<link>/2020/09/06/randomized-one-factor-design-and-the-analysis-of-variance-anova/</link>
<pubDate>Sun, 06 Sep 2020 00:00:00 +0000</pubDate>
<guid>/2020/09/06/randomized-one-factor-design-and-the-analysis-of-variance-anova/</guid>
<description>If we want to compare the average effects elicited by \(k\) different levels of some given factor, there will be \(k\) independent random samples of sizes \(n_j\quad (j=1,2,...,k)\), the total sample size is \(n=\displaystyle\sum_{j=1}^{k}n_j\). Let \(Y_{ij}\) represent the \(i^{th}\) observation recorded for the \(j^{th}\) level.\[\begin{array}{|c|cccc|}\hline&amp;&amp;\text{treatment}&amp;\text{levels}&amp;\\\hline&amp; 1 &amp; 2 &amp; \cdots &amp; k \\\hline&amp; Y_{11} &amp; Y_{12} &amp; \cdots &amp; Y_{1k} \\&amp; Y_{21} &amp; Y_{22} &amp; \cdots &amp; Y_{2k} \\&amp;\vdots &amp;\vdots &amp;\cdots&amp;\vdots \\&amp;Y_{n_11} &amp;Y_{n_22} &amp;\cdots&amp;Y_{n_kk} \\\text{Sample sizes:}&amp;n_1&amp;n_2&amp;\cdots&amp;n_k\\\text{Sample totals:}&amp;T_{.</description>
</item>
<item>
<title>covariance and correlation coefficient</title>
<link>/2020/09/05/covariance-and-correlation-coefficient/</link>
<pubDate>Sat, 05 Sep 2020 00:00:00 +0000</pubDate>
<guid>/2020/09/05/covariance-and-correlation-coefficient/</guid>
<description>We define the covariance of any two random variables \(X\) and \(Y\), written \(Cov(X,Y)\), as: \[\begin{align}Cov(X,Y) &amp;= E(X-\mu_X)(Y-\mu_Y)\\&amp;= E(XY-X\mu_Y-Y\mu_X+\mu_X\mu_Y)\\&amp;= E(XY)-\mu_X\mu_Y-\mu_X\mu_Y+\mu_X\mu_Y\\&amp;= E(XY) - \mu_X\mu_Y\\&amp;= E(XY) − E(X)E(Y)\\\end{align}\].If \(X\) and \(Y\) are independent random variables,\[\begin{align}E(XY)&amp;=\int\int xy\cdot f_{X,Y}(x,y)dxdy\\&amp;=\int\int xy\cdot f_X(x)f_Y(y)dxdy\\&amp;=\int x\cdot f_X(x)dx\int y\cdot f_Y(y)dy\\&amp;=E(X)E(Y)\end{align}\], then \(Cov(X,Y) = E(XY) − E(X)E(Y)=0\)
The Variance of the sum of two random variables \(aX + bY\) is:\[\begin{align}Var(aX + bY) &amp;= E(aX + bY)^2-(E(aX + bY))^2\\&amp;=E(aX + bY)^2-(a\mu_X+b\mu_Y)^2\\&amp;=E(a^2X^2+2aXbY+b^2Y^2)-a^2\mu_X^2-2a\mu_Xb\mu_Y-b^2\mu_Y^2\\&amp;=a^2(E(X^2)-\mu_X^2)+b^2(E(Y^2)-\mu_Y^2)+2ab(E(XY)-\mu_X\mu_Y)\\&amp;=a^2Var(X)+b^2Var(Y)+2abCov(X,Y)\end{align}\].</description>
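A quick simulation check in R of the identity \(Var(aX+bY)=a^2Var(X)+b^2Var(Y)+2abCov(X,Y)\) (a, b and the joint distribution below are assumed for illustration; the identity also holds exactly for sample moments, so the two numbers agree up to floating point):
set.seed(7)
n &lt;- 1e5; a &lt;- 2; b &lt;- -3
x &lt;- rnorm(n)
y &lt;- 0.5 * x + rnorm(n)                                  # correlated with x, so Cov(X, Y) is nonzero
var(a * x + b * y)                                       # sample variance of aX + bY
a^2 * var(x) + b^2 * var(y) + 2 * a * b * cov(x, y)      # right-hand side of the identity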
</item>
<item>
<title>Regression random variable Y for a given value x</title>
<link>/2020/09/04/regression-random-variable-y-for-a-given-value-x/</link>
<pubDate>Fri, 04 Sep 2020 00:00:00 +0000</pubDate>
<guid>/2020/09/04/regression-random-variable-y-for-a-given-value-x/</guid>
<description>We want to regress a random variable \(Y\) on a given value \(x\); the function \(f_{Y|x}(y)\) denotes the pdf of the random variable \(Y\) for a given value \(x\), and the expected value associated with \(f_{Y|x}(y)\) is \(E(Y | x)\). The function \(y = E(Y | x)\) is called the regression curve of \(Y\) on \(x\). The regression model is called a simple linear model if it satisfies the \(4\) assumptions:</description>
</item>
<item>
<title>Linear Regression</title>
<link>/2020/09/03/linear-regression/</link>
<pubDate>Thu, 03 Sep 2020 00:00:00 +0000</pubDate>
<guid>/2020/09/03/linear-regression/</guid>
<description>If there are \(n\) points \((x_1,y_1),(x_2,y_2),...,(x_n,y_n)\), we seek the straight line \(y=a+bx\) minimizing the sum of the squares of the vertical distances from the data points to the line, \(L=\sum_{i=1}^{n}(y_i-a-bx_i)^2\); we take partial derivatives of \(L\) with respect to \(a\) and \(b\) and set them equal to \(0\) to get the least squares coefficients \(a\) and \(b\):\[\frac{\partial L}{\partial b}=-2\sum_{i=1}^{n}(y_i-a-bx_i)x_i=0\], then \[\sum_{i=1}^{n}x_iy_i=a\sum_{i=1}^{n}x_i+b\sum_{i=1}^{n}x_i^2\]
And, \[\frac{\partial L}{\partial a}=-2\sum_{i=1}^{n}(y_i-a-bx_i)=0\], then\[\sum_{i=1}^{n}y_i=na+b\sum_{i=1}^{n}x_i\]these 2 equations are:\[\begin{bmatrix}\displaystyle\sum_{i=1}^{n}x_i &amp; \displaystyle\sum_{i=1}^{n}x_i^2\\n &amp; \displaystyle\sum_{i=1}^{n}x_i\\\end{bmatrix}\begin{bmatrix}a\\b\end{bmatrix}=\begin{bmatrix}\displaystyle\sum_{i=1}^{n}x_iy_i\\\displaystyle\sum_{i=1}^{n}y_i\end{bmatrix}\]then, using Cramer’s rule\[\begin{align}b&amp;=\frac{\begin{bmatrix}\displaystyle\sum_{i=1}^{n}x_i &amp; \displaystyle\sum_{i=1}^{n}x_iy_i\\n &amp; \displaystyle\sum_{i=1}^{n}y_i\\\end{bmatrix}}{\begin{bmatrix}\displaystyle\sum_{i=1}^{n}x_i &amp; \displaystyle\sum_{i=1}^{n}x_i^2\\n &amp; \displaystyle\sum_{i=1}^{n}x_i\\\end{bmatrix}}\\&amp;=\frac{(\displaystyle\sum_{i=1}^{n}x_i)(\displaystyle\sum_{i=1}^{n}y_i)-n(\displaystyle\sum_{i=1}^{n}x_iy_i)}{(\displaystyle\sum_{i=1}^{n}x_i)^2-n\displaystyle\sum_{i=1}^{n}x_i^2}\\&amp;=\frac{n(\displaystyle\sum_{i=1}^{n}x_iy_i)-(\displaystyle\sum_{i=1}^{n}x_i)(\displaystyle\sum_{i=1}^{n}y_i)}{n\displaystyle\sum_{i=1}^{n}x_i^2-(\displaystyle\sum_{i=1}^{n}x_i)^2}\\&amp;=\frac{(\displaystyle\sum_{i=1}^{n}x_iy_i)-\frac{1}{n}(\displaystyle\sum_{i=1}^{n}x_i)(\displaystyle\sum_{i=1}^{n}y_i)}{\displaystyle\sum_{i=1}^{n}x_i^2-\frac{1}{n}(\displaystyle\sum_{i=1}^{n}x_i)^2}\end{align}\], and, \(a=\frac{\displaystyle\sum_{i=1}^{n}y_i-b\sum_{i=1}^{n}x_i}{n}=\bar y-b\bar x\), which shows point \((\bar x, \bar y)\) is in the line.</description>
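A quick check in R (simulated data, so all numbers are assumptions) that the closed-form slope and intercept above agree with lm():
set.seed(3)
x &lt;- runif(20); y &lt;- 1 + 2 * x + rnorm(20, sd = 0.1)     # assumed example data
b &lt;- (sum(x * y) - sum(x) * sum(y) / length(x)) /
     (sum(x^2) - sum(x)^2 / length(x))                   # slope from the formula above
a &lt;- mean(y) - b * mean(x)                               # intercept: the line passes through (xbar, ybar)
c(a = a, b = b)
coef(lm(y ~ x))                                          # the same values from lm()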
</item>
<item>
<title>The square of a Student t random variable is an F distribution with 1 and n df</title>
<link>/2020/08/30/the-square-of-student-t-random-variable-is-a-f-distribution-with-with-1-and-n-df/</link>
<pubDate>Sun, 30 Aug 2020 00:00:00 +0000</pubDate>
<guid>/2020/08/30/the-square-of-student-t-random-variable-is-a-f-distribution-with-with-1-and-n-df/</guid>
<description>The Student t ratio with \(n\) degrees of freedom is denoted \(T_n\), where \(T_n=\frac{Z}{\sqrt{\frac{U}{n}}}\), \(Z\) is a standard normal random variable and \(U\) is a \(\chi^2\) random variable independent of \(Z\) with \(n\) degrees of freedom.
Because \(T_n^2= \frac{Z^2}{U/n}\) has an \(F\) distribution with \(1\) and \(n\) df, then,\[f_{T_n^2}(t)=\frac{\Gamma(\frac{1+n}{2})}{\Gamma(\frac{1}{2})\Gamma(\frac{n}{2})}\frac{n^{\frac{n}{2}}t^{-\frac{1}{2}}}{(n+t)^{\frac{1+n}{2}}},\quad t&gt;0\]
Then,\[\begin{align}f_{T_n}(t)&amp;=\frac{d}{dt}F_{T_n}(t)\\&amp;=\frac{d}{dt}P(T_n\le t)\\&amp;=\frac{d}{dt}(\frac{1}{2}+P(0\le T_n\le t))\\&amp;=\frac{d}{dt}(\frac{1}{2}+\frac{1}{2}P(-t\le T_n\le t))\quad (t&gt;0)\\&amp;=\frac{d}{dt}(\frac{1}{2}+\frac{1}{2}P(T_n^2\le t^2))\\&amp;=\frac{d}{dt}(\frac{1}{2}+\frac{1}{2}F_{T_n^2}(t^2))\\&amp;=t\cdot f_{T_n^2}(t^2)\\&amp;=t\cdot \frac{\Gamma(\frac{1+n}{2})}{\Gamma(\frac{1}{2})\Gamma(\frac{n}{2})}\frac{n^{\frac{n}{2}}t^{-1}}{(n+t^2)^{\frac{1+n}{2}}}\\&amp;=\frac{\Gamma(\frac{1+n}{2})}{\Gamma(\frac{1}{2})\Gamma(\frac{n}{2})}\frac{1}{\sqrt{n}}\frac{1}{(1+\frac{t^2}{n})^{\frac{1+n}{2}}}\\&amp;=\frac{\Gamma(\frac{1+n}{2})}{\Gamma(\frac{n}{2})}\frac{1}{\sqrt{n\pi}}\frac{1}{(1+\frac{t^2}{n})^{\frac{1+n}{2}}}\end{align}\]</description>
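Both relationships can be checked numerically in R (the degrees of freedom and the grid of t values are assumptions):
n &lt;- 7                                # assumed degrees of freedom
t &lt;- seq(0.1, 3, by = 0.1)
# P(|T_n| &lt;= t) = P(T_n^2 &lt;= t^2): the t and F(1, n) distributions agree
all.equal(2 * pt(t, df = n) - 1, pf(t^2, df1 = 1, df2 = n))
# the derived density of T_n matches R's dt()
f &lt;- gamma((1 + n) / 2) / gamma(n / 2) / sqrt(n * pi) * (1 + t^2 / n)^(-(1 + n) / 2)
all.equal(f, dt(t, df = n))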
</item>
<item>
<title>From the cumulative distribution of the standard normal distribution we can get the chi square distribution</title>
<link>/2020/08/29/from-the-cumulative-distribution-of-standard-normal-distribution-can-get-the-chi-square-distribution/</link>
<pubDate>Sat, 29 Aug 2020 00:00:00 +0000</pubDate>
<guid>/2020/08/29/from-the-cumulative-distribution-of-standard-normal-distribution-can-get-the-chi-square-distribution/</guid>
<description>The probability that a standard normal random variable lies in the region \((-x,x),x&gt;0\), is:\[\begin{align}\Phi(x)&amp;=\frac{1}{\sqrt{2\pi}}\int_{-x}^{x} e^{-\frac{1}{2}z^2}dz\\&amp;=\frac{2}{\sqrt{2\pi}}\int_{0}^{x} e^{-\frac{1}{2}z^2}dz\\&amp;=\frac{2}{\sqrt{2\pi}}\int_{0}^{x^2} \frac{1}{2\sqrt{u}}e^{-\frac{1}{2}u}du \quad (u=z^2)\\&amp;=\frac{1}{\sqrt{2\pi}}\int_{0}^{x^2} \frac{1}{\sqrt{u}}e^{-\frac{1}{2}u}du\\&amp;=\int_{0}^{x^2}\frac{(\frac{1}{2})^{\frac{1}{2}}}{\Gamma(\frac{1}{2})}u^{(\frac{1}{2})-1}e^{-\frac{1}{2}u}du\end{align}\]
Here the integrand \(f_U(u)=\frac{(\frac{1}{2})^{\frac{1}{2}}}{\Gamma(\frac{1}{2})}u^{(\frac{1}{2})-1}e^{-\frac{1}{2}u}\) is a special Gamma density with \(r=\frac{1}{2}, \lambda=\frac{1}{2}\). Here \(u=z^2\), where \(z\) is a standard normal random variable.
And the sum of \(m\) independent \(U_j=Z_j^2\) variables\[Y=\sum_{j=1}^{m} U_j=\sum_{j=1}^{m} Z_{j}^{2}\] still has a Gamma distribution:\[f_Y(y)=\frac{(\frac{1}{2})^{\frac{m}{2}}}{\Gamma(\frac{m}{2})}y^{(\frac{m}{2})-1}e^{-\frac{1}{2}y}\], and we give the special Gamma distribution with \(r=\frac{m}{2}, \lambda=\frac{1}{2}\) a new name: the \(\chi^2\) distribution with \(m\) degrees of freedom.</description>
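A numeric check in R of the identity \(P(-x\le Z\le x)=P(Z^2\le x^2)\) and of the \(\chi^2\) / Gamma correspondence (the grid of x values and the value of m are assumptions):
x &lt;- seq(0.2, 3, by = 0.2)
all.equal(pnorm(x) - pnorm(-x), pchisq(x^2, df = 1))     # standard normal in (-x, x) vs chi-square(1) at x^2
m &lt;- 5
all.equal(pchisq(x^2, df = m), pgamma(x^2, shape = m / 2, rate = 1 / 2))   # chi-square(m) is Gamma(m/2, 1/2)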
</item>
<item>
<title>Ratio of 2 independent chi square random variables divided by their degrees of freedom is F distribution</title>
<link>/2020/08/29/ratio-of-2-independent-chi-square-random-variables-divided-by-their-degrees-of-freedom-is-f-distribution/</link>
<pubDate>Sat, 29 Aug 2020 00:00:00 +0000</pubDate>
<guid>/2020/08/29/ratio-of-2-independent-chi-square-random-variables-divided-by-their-degrees-of-freedom-is-f-distribution/</guid>
<description>When \(V\) and \(U\) are two independent \(\chi^2\) random variables: \(f_V(v)=\frac{(\frac{1}{2})^{\frac{m}{2}}}{\Gamma(\frac{m}{2})}v^{(\frac{m}{2})-1}e^{-\frac{1}{2}v}\)
\(f_U(u)=\frac{(\frac{1}{2})^{\frac{n}{2}}}{\Gamma(\frac{n}{2})}u^{(\frac{n}{2})-1}e^{-\frac{1}{2}u}\)
with \(m\) and \(n\) degrees of freedom respectively, the pdf of \(W=V/U\) is:
\[\begin{align}f_{V/U}(\omega)&amp;=\int_{0}^{+\infty}|u|f_U(u)f_V(u\omega)du\\&amp;=\int_{0}^{+\infty}u\frac{(\frac{1}{2})^{\frac{n}{2}}}{\Gamma(\frac{n}{2})}u^{\frac{n}{2}-1}e^{-\frac{1}{2}u} \frac{(\frac{1}{2})^{\frac{m}{2}}}{\Gamma(\frac{m}{2})}(u\omega)^{\frac{m}{2}-1}e^{-\frac{1}{2}u\omega}du\\&amp;=\frac{(\frac{1}{2})^{\frac{n}{2}}}{\Gamma(\frac{n}{2})}\frac{(\frac{1}{2})^{\frac{m}{2}}}{\Gamma(\frac{m}{2})} \omega^{\frac{m}{2}-1} \int_{0}^{+\infty}u^{\frac{n}{2}}u^{\frac{m}{2}-1} e^{-\frac{1}{2}u(1+\omega)}du\\&amp;=\frac{(\frac{1}{2})^{\frac{n}{2}}}{\Gamma(\frac{n}{2})}\frac{(\frac{1}{2})^{\frac{m}{2}}}{\Gamma(\frac{m}{2})} \omega^{\frac{m}{2}-1} \int_{0}^{+\infty}u^{\frac{n+m}{2}-1} e^{-\frac{1}{2}u(1+\omega)}du\\&amp;=\frac{(\frac{1}{2})^{\frac{n}{2}}}{\Gamma(\frac{n}{2})}\frac{(\frac{1}{2})^{\frac{m}{2}}}{\Gamma(\frac{m}{2})} \omega^{\frac{m}{2}-1} (\frac{\Gamma(\frac{n+m}{2})}{(\frac{1}{2}(1+\omega))^{\frac{n+m}{2}}})\\&amp;=\frac{\Gamma(\frac{n+m}{2})}{\Gamma(\frac{n}{2})\Gamma(\frac{m}{2})}\frac{\omega^{\frac{m}{2}-1}}{(1+\omega)^{\frac{n+m}{2}}}\end{align}\]
Then, the pdf for \(W=\frac{V/m}{U/n}\) is:\[\begin{align}f_{\frac{V/m}{U/n}}&amp;=f_{\frac{n}{m}V/U}\\&amp;=\frac{m}{n}f_{V/U}(\frac{m}{n}\omega)\\&amp;=\frac{m}{n}\frac{\Gamma(\frac{n+m}{2})}{\Gamma(\frac{n}{2})\Gamma(\frac{m}{2})}\frac{(\frac{m}{n}\omega)^{\frac{m}{2}-1}}{(1+\frac{m}{n}\omega)^{\frac{n+m}{2}}}\\&amp;=\frac{\Gamma(\frac{n+m}{2})}{\Gamma(\frac{n}{2})\Gamma(\frac{m}{2})}\frac{m}{n}\frac{(\frac{m}{n}\omega)^{\frac{m}{2}-1}}{(n+m\omega)^{\frac{n+m}{2}}}n^{\frac{n+m}{2}}\\&amp;=\frac{\Gamma(\frac{n+m}{2})}{\Gamma(\frac{n}{2})\Gamma(\frac{m}{2})}\frac{m^{\frac{m}{2}}n^{\frac{n}{2}}\omega^{\frac{m}{2}-1}}{(n+m\omega)^{\frac{n+m}{2}}}\end{align}\], which is a \(F\) distribution with \(m\) and \(n\) degrees of freedom.</description>
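A simulation check in R (degrees of freedom, sample size and seed are assumptions) that \((V/m)/(U/n)\) behaves like an \(F_{m,n}\) variable:
set.seed(11)
m &lt;- 4; n &lt;- 9; N &lt;- 1e5
V &lt;- rchisq(N, df = m); U &lt;- rchisq(N, df = n)
W &lt;- (V / m) / (U / n)                       # ratio of chi-squares over their degrees of freedom
qs &lt;- c(0.25, 0.5, 0.75, 0.9)
cbind(empirical = quantile(W, qs), F_quantile = qf(qs, df1 = m, df2 = n))   # the columns nearly agree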
</item>
<item>
<title>Geometric Distribution is the first success occurs on kth Bernoulli trial, Negative Binomial is the rth success occurs on kth Bernoulli trial</title>
<link>/2020/08/25/geometric-distribution-is-the-first-success-occurs-on-kth-bernoulli-trial-negative-binomial-is-the-rth-success-occurs-on-kth-bernoulli-trial/</link>
<pubDate>Tue, 25 Aug 2020 00:00:00 +0000</pubDate>
<guid>/2020/08/25/geometric-distribution-is-the-first-success-occurs-on-kth-bernoulli-trial-negative-binomial-is-the-rth-success-occurs-on-kth-bernoulli-trial/</guid>
<description>The Geometric variable X has a pdf like this:\[P_X(k)=P(X=k)=(1-p)^{k-1}p, \quad k=1,2,3,..\]
The moment-generating function for a Geometric random variable X is:\[\begin{align}M_X(t)=E(e^{tX})&amp;=\sum_{all\ k}e^{tk}(1-p)^{k-1}p\\&amp;=\frac{p}{1-p}\sum_{all\ k}(e^t(1-p))^{k}\\&amp;=\frac{p}{1-p}(\frac{1}{1-e^t(1-p)}-1)\\&amp;=\frac{pe^t}{1-(1-p)e^t}\end{align}\]
The expected value is:\[\begin{align}M_X^{(1)}(t)&amp;=\frac{d}{dt}\frac{pe^t}{1-(1-p)e^t}\\&amp;=\frac{pe^t}{1-(1-p)e^t}+\frac{pe^t(1-p)e^t}{(1-(1-p)e^t)^2}\Bigl|_{t=0}\\&amp;=1+\frac{p(1-p)}{p^2}\\&amp;=\frac{1}{p}\end{align}\]
\[\begin{align}M_X^{(2)}(t)&amp;=\frac{d}{dt}\Bigl(\frac{pe^t}{1-(1-p)e^t}+\frac{pe^t(1-p)e^t}{(1-(1-p)e^t)^2}\Bigr)\\&amp;=\frac{pe^t}{1-(1-p)e^t}+\frac{pe^t(1-p)e^t}{(1-(1-p)e^t)^2}+\frac{2pe^{2t}(1-p)}{(1-(1-p)e^t)^2}+\frac{2pe^{3t}(1-p)^2}{(1-(1-p)e^t)^3}\Biggl|_{t=0}\\&amp;=1+(1/p-1)+2(1/p-1)+2(1/p-1)^2\\&amp;=2/p^2-1/p\end{align}\]
Then, the Variance is:\(Var(X)=E(X^2)-(E(X))^2=2/p^2-1/p-1/p^2=1/p^2-1/p=\frac{1-p}{p^2}\)
Negative Binomial: the rth success occurs on the kth Bernoulli trial. The Negative Binomial variable Y has a pdf like this:\[P_Y(k)=P(Y=k)=\binom{k-1}{r-1}p^r(1-p)^{k-r}, \quad k=r,r+1,r+2,.</description>
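A direct numerical check in R of \(E(X)=1/p\) and \(Var(X)=\frac{1-p}{p^2}\) (the value of p and the truncation point are assumptions; the sums use the pdf above rather than R's dgeom, which counts failures instead of trials):
p &lt;- 0.3
k &lt;- 1:2000                               # truncate the infinite sums
pk &lt;- (1 - p)^(k - 1) * p                 # P(X = k) from the pdf above
c(EX = sum(k * pk), VarX = sum(k^2 * pk) - sum(k * pk)^2)
c(1 / p, (1 - p) / p^2)                   # theoretical mean and variance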
</item>
<item>
<title>Exponential distribution is interval between consecutive Poisson events</title>
<link>/2020/08/24/exponential-distribution-is-interval-between-consecutive-poisson-events/</link>
<pubDate>Mon, 24 Aug 2020 00:00:00 +0000</pubDate>
<guid>/2020/08/24/exponential-distribution-is-interval-between-consecutive-poisson-events/</guid>
<description>Let’s denote the interval between consecutive Poisson events by the random variable Y. During the interval that extends from a to a + y, the number of Poisson events k has probability \(P(k)=e^{-\lambda y} \frac{(\lambda y)^k}{k!}\); if \(k=0\), \(e^{-\lambda y}\frac{(\lambda y)^0}{0!}=e^{-\lambda y}\) means there is no event during the (a,a+y) time period.
Because there will be no occurrences in the interval (a, a + y) if and only if \(Y &gt; y\), we have \(P(Y &gt; y)=e^{-\lambda y}\), and the cdf is \(F_Y(y)=P(Y \le y)=1-P(Y &gt; y)=1-e^{-\lambda y}\).</description>
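A short R check (the rate \(\lambda\) and the grid of y values are assumptions) that \(P(Y&gt;y)=e^{-\lambda y}\) is exactly the probability of zero Poisson events in an interval of length y, and that the resulting cdf is the Exponential cdf:
lambda &lt;- 2
y &lt;- seq(0.1, 3, by = 0.1)
all.equal(dpois(0, lambda * y), exp(-lambda * y))          # no events in (a, a + y)
all.equal(pexp(y, rate = lambda), 1 - exp(-lambda * y))    # F_Y(y) derived above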
</item>
<item>
<title>Poisson is a limit of Binomial when n goes to infinity with np maintained</title>
<link>/2020/08/24/poisson-is-a-limit-of-binomial-when-n-goes-to-infinity-with-np-maintained/</link>
<pubDate>Mon, 24 Aug 2020 00:00:00 +0000</pubDate>
<guid>/2020/08/24/poisson-is-a-limit-of-binomial-when-n-goes-to-infinity-with-np-maintained/</guid>
<description>The binomial random variable has a pdf like this:\(P_X(k)=\binom{n}{k}p^k(1-p)^{n-k},\quad k=0,1,2,...,n\)Its moment-generating function is:\[\begin{align}M_X(t)=E(e^{tX})&amp;=\sum_{k=0}^{n}e^{tk}\binom{n}{k}p^k(1-p)^{n-k}\\&amp;=\sum_{k=0}^{n}\binom{n}{k}(e^tp)^k(1-p)^{n-k}\\&amp;=(1-p+pe^t)^n\end{align}\]
Then \(M_X^{(1)}(t)=n(1-p+pe^t)^{n-1}pe^t|_{t=0}=np=E(X)\)\[\begin{align}M_X^{(2)}(t)&amp;=n(n-1)(1-p+pe^t)^{n-2}pe^tpe^t+n(1-p+pe^t)^{n-1}pe^t|_{t=0}\\&amp;=n(n-1)p^2+np=E(X^2)\end{align}\]
Then \(Var(X)=E(X^2)-(E(X))^2=n(n-1)p^2+np-(np)^2=-np^2+np=np(1-p)\)
For the binomial random variable X:\(P_X(k)=\binom{n}{k}p^k(1-p)^{n-k},\quad k=0,1,2,...,n\), if \(n\to+\infty\) with \(\lambda=np\) remains constant, then\[\begin{align}\lim_{n\to+\infty}\binom{n}{k}p^k(1-p)^{n-k}&amp;=\lim_{n\to+\infty}\frac{n!}{k!(n-k)!}(\frac{\lambda}{n})^k(1-\frac{\lambda}{n})^{n-k}\\&amp;=\lim_{n\to+\infty}\frac{n!}{k!(n-k)!}\lambda^k(\frac{1}{n})^k(1-\frac{\lambda}{n})^n(1-\frac{\lambda}{n})^{-k}\\&amp;=\frac{\lambda^k}{k!}\lim_{n\to+\infty}\frac{n!}{(n-k)!}(\frac{1}{n})^k(\frac{n}{n-\lambda})^k(1-\frac{\lambda}{n})^n\\&amp;=e^{-\lambda}\frac{\lambda^k}{k!}\lim_{n\to+\infty}\frac{n!}{(n-k)!}(\frac{1}{n-\lambda})^k\\&amp;=e^{-\lambda}\frac{\lambda^k}{k!}\lim_{n\to+\infty}\frac{n(n-1)...(n-k+1)}{(n-\lambda)(n-\lambda)...(n-\lambda)}\\&amp;=e^{-\lambda}\frac{\lambda^k}{k!}\end{align}\]
The moment-generating function of a Poisson random variable X is:\[\begin{align}M_X(t)=E(e^{tX})&amp;=\sum_{k=0}^{\infty}e^{tk}e^{-\lambda}\frac{\lambda^k}{k!}\\&amp;=e^{-\lambda}\sum_{k=0}^{\infty}\frac{(\lambda e^t)^k}{k!}\\&amp;=e^{-\lambda}e^{\lambda e^t}\\&amp;=e^{\lambda e^t-\lambda}\end{align}\]</description>
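A quick numerical illustration in R (the value of \(\lambda\), the range of k and the values of n are assumptions):
lambda &lt;- 3; k &lt;- 0:10
for (n in c(10, 100, 10000)) {
  print(max(abs(dbinom(k, size = n, prob = lambda / n) - dpois(k, lambda))))
}
# the largest difference shrinks as n grows with np = lambda held fixed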
</item>
<item>
<title>The Gamma random variable denotes the waiting time for a Poisson event also the sum of Exponential events</title>
<link>/2020/08/24/the-gamma-random-variable-denotes-the-waiting-time-for-the-rth-poisson-event/</link>
<pubDate>Mon, 24 Aug 2020 00:00:00 +0000</pubDate>
<guid>/2020/08/24/the-gamma-random-variable-denotes-the-waiting-time-for-the-rth-poisson-event/</guid>
<description>The Gamma random variable denotes the waiting time for the \(r^{th}\) Poisson event, and also denotes the sum of \(r\) independent Exponential random variables. The sum of \(m\) independent Gamma random variables (sharing the same parameter \(\lambda\)) is a Gamma random variable, which denotes the waiting time for the \((\sum_{i=1}^{m} r_i)^{th}\) Poisson event, and also denotes the sum of \(\sum_{i=1}^{m} r_i\) Exponential random variables.
Let \(Y\) denote the waiting time to the occurrence of the \(r^{th}\) Poisson event; the probability that fewer than \(r\) Poisson events occur in the \([0, y]\) time period is \(P(Y&gt;y)=\sum_{k=0}^{r-1}e^{-\lambda y}\frac{(\lambda y)^k}{k!</description>
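A numeric check in R (r, \(\lambda\) and the grid of y values are assumptions) that this tail probability is the Gamma survival function, and that a sum of r Exponential(\(\lambda\)) variables behaves like a Gamma(r, \(\lambda\)) variable:
r &lt;- 4; lambda &lt;- 2
y &lt;- seq(0.2, 5, by = 0.2)
all.equal(ppois(r - 1, lambda * y), 1 - pgamma(y, shape = r, rate = lambda))   # fewer than r events by time y
set.seed(5)
s &lt;- colSums(matrix(rexp(r * 1e5, rate = lambda), nrow = r))    # 1e5 sums of r exponentials
qs &lt;- c(0.25, 0.5, 0.75, 0.9)
cbind(empirical = quantile(s, qs), Gamma_quantile = qgamma(qs, shape = r, rate = lambda))   # nearly agree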
</item>
<item>
<title>The Gamma and Beta functions</title>
<link>/2020/08/21/the-gamma-and-beta-functions/</link>
<pubDate>Fri, 21 Aug 2020 00:00:00 +0000</pubDate>
<guid>/2020/08/21/the-gamma-and-beta-functions/</guid>
<description>The Gamma function:\[\Gamma(s)=\int_{0}^{+\infty}t^{s-1}e^{-t}dt\quad \Bigl(=(s-1)! \quad s\in \mathbb Z^+\Bigr) (0&lt;s&lt;\infty)\] Because \[\begin{align}\Gamma(s+1)&amp;=\int_{0}^{+\infty}t^{s}e^{-t}dt\\&amp;=-\int_{0}^{+\infty}t^{s}d(e^{-t})\\&amp;=-\Biggl[t^{s}e^{-t}|_{0}^{\infty}-\int_{0}^{+\infty}st^{s-1}e^{-t}dt\Biggl]\\&amp;=-\Biggl[0-s\Gamma(s)\Biggl]\\&amp;=s\Gamma(s)\end{align}\] and \[\Gamma(1)=\int_{0}^{+\infty}t^{1-1}e^{-t}dt=\int_{0}^{+\infty}e^{-t}dt=1\]The product of two Gamma functions:\[\begin{align}\Gamma(x)\Gamma(y)&amp;=\int_{0}^{+\infty}u^{x-1}e^{-u}du\int_{0}^{+\infty}v^{y-1}e^{-v}dv\\&amp;=\int_{u=0}^{+\infty}\int_{v=0}^{+\infty}e^{-(u+v)}u^{x-1}v^{y-1}dudv \quad (let\quad u+v=z; \quad u/z=t; \quad v/z=1-t; \quad dudv=zdtdz)\\&amp;=\int_{z=0}^{+\infty}\int_{t=0}^{t=1}e^{-z}(zt)^{x-1}(z(1-t))^{y-1}zdtdz\\&amp;=\int_{z=0}^{+\infty}e^{-z}z^{(x+y-1)}dz\int_{t=0}^{t=1}t^{(x-1)}(1-t)^{(y-1)}dt\\&amp;=\Gamma(x+y)\int_{t=0}^{t=1}t^{(x-1)}(1-t)^{(y-1)}dt\end{align}\]
We define this integral \(\int_{t=0}^{t=1}t^{(x-1)}(1-t)^{(y-1)}dt\) as \(B(x,y),\quad (x&gt;0 ;\quad y&gt;0)\); this is the Beta function. It satisfies \(B(x,y)=\frac{\Gamma(x)\Gamma(y)}{\Gamma(x+y)}\) \(\Bigl(=\frac{(x-1)!(y-1)!}{(x+y-1)!}\quad x;y\in \mathbb Z^+ \Bigr)\), the complete Beta function.</description>
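The identity can be checked numerically in R (the values of x and y are assumptions):
x &lt;- 2.5; y &lt;- 4.2
all.equal(beta(x, y), gamma(x) * gamma(y) / gamma(x + y))
integrate(function(t) t^(x - 1) * (1 - t)^(y - 1), 0, 1)$value   # numerically matches beta(x, y)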
</item>
<item>
<title>How to derive the beautiful probability density function (pdf) of Normal Distribution?</title>
<link>/2020/08/13/distributions/</link>
<pubDate>Thu, 13 Aug 2020 00:00:00 +0000</pubDate>
<guid>/2020/08/13/distributions/</guid>
<description>How can we derive the probability density function (pdf) of Normal Distribution?\[f_Y(y)=\frac{1}{\sqrt{2\pi}\sigma}e^{-\frac{1}{2}(\frac{y-\mu}{\sigma})^2}, \quad -\infty&lt;y&lt;+\infty\]
Let’s draw a normal pdf first
#draw normal pdf
x &lt;- seq(-5, 5, length.out = 201); dx &lt;- diff(x)[1]
y &lt;- dnorm(x, mean = 0, sd = 1)
base::plot(x, y, type = &quot;l&quot;, col = &quot;skyblue&quot;,
  xlab=&quot;x&quot; , ylab=&quot;p(x)&quot; , cex.lab=1.5,
  main=&quot;Normal Probability Density&quot; , cex.main=1.5, lwd=2)
text( 0, .6*max(y) , bquote( paste(mu ,&quot; = 0 &quot;) ), cex=1.</description>
</item>
<item>
<title>Running wsl commands using system2() function in R</title>
<link>/2020/08/12/running-wsl-commands-using-system2-function-in-r/</link>
<pubDate>Wed, 12 Aug 2020 00:00:00 +0000</pubDate>
<guid>/2020/08/12/running-wsl-commands-using-system2-function-in-r/</guid>
<description>Accession number NC_045512 in Fasta format. Using the “wsl” command in system2() to run commands in wsl:
system2(&quot;wsl&quot;, &quot;cd ~/bioinfor/; ls&quot;, stdout = TRUE)
## [1] &quot;AF086833.gb&quot; &quot;NC_045512-version1.fa&quot; &quot;RNASeqByExample&quot;
## [4] &quot;chr22.fa&quot; &quot;runinfo.csv&quot;
We can retrieve the SARS-coronavirus 2 gene sequences using efetch
system2(&quot;wsl&quot;,&quot;efetch -db=nuccore -format=gb -id=NC_045512&quot;, stdout = &quot;../../../NC_045512.gb&quot;)
Accession number NC_045512 in Fasta format.
system2(&quot;wsl&quot;,&quot;efetch -db=nuccore -format=fasta -id=NC_045512 &gt; NC_045512.fa&quot;, stdout = TRUE)
## character(0)
system2(&quot;wsl&quot;, &quot;cat .</description>
</item>
<item>
<title>About this site</title>
<link>/about/</link>
<pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate>
<guid>/about/</guid>
<description>This website was constructed using R markdown and R blogdown, the wonderful packages by Yihui Xie, and the Hugo Lithium theme. These pages are hosted by GitHub Pages.
Interesting blogs which I followed:
colah; Chris Choy; Terence Tao; Piotr Migdał; Lak Lakshmanan
1. R Core Team. R: A language and environment for statistical computing. Vienna, Austria: R Foundation for Statistical Computing; 2021. https://www.R-project.org/.
2. Xie Y, Hill AP, Thomas A.</description>
</item>
<item>
<title>Curriculum Vitae</title>
<link>/vitae/</link>
<pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate>
<guid>/vitae/</guid>
<description>Contact Information
– Email: dan@danli.org;
– Homepage: http://danli.org/;
– Orcid: orcid;
– Github: https://github.com/danli349;
– Twitter: @LiDan;
– StackExchange: StackExchange;
– Biostars: Biostars;</description>
</item>
</channel>
</rss>