<?xml version="1.0" encoding="utf-8" standalone="yes"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom">
<channel>
<title>A Hugo website</title>
<link>/</link>
<description>Recent content on A Hugo website</description>
<generator>Hugo -- gohugo.io</generator>
<language>en-us</language>
<lastBuildDate>Thu, 28 Dec 2023 00:00:00 +0000</lastBuildDate><atom:link href="/index.xml" rel="self" type="application/rss+xml" />
<item>
<title>An Introduction to Generalized Linear Models (4th edition)</title>
<link>/2023/12/28/an-introduction-to-generalized-linear-models/</link>
<pubDate>Thu, 28 Dec 2023 00:00:00 +0000</pubDate>
<guid>/2023/12/28/an-introduction-to-generalized-linear-models/</guid>
<description>Chapter2 Model Fitting2.5 ExercisesChapter3 Exponential Family and Generalized Linear ModelsExercisesChapter4 EstimationChapter5 InferenceChapter6 Normal Linear ModelsChapter7 Binary Variables and Logistic RegressionChapter8 Nominal and Ordinal Logistic Regression8.2 Multinomial distributionExercisesChapter9 Poisson Regression and Log-Linear Models9.2 Poisson regressionChapter10 Survival AnalysisChapter11 Clustered and Longitudinal DataChapter12 Bayesian AnalysisChapter13 Markov Chain Monte Carlo MethodsChapter14 Example Bayesian AnalysesChapter2 Model Fittinglibrary(dobson)library(ggprism)library(tidyverse)birthweight## # A tibble: 12 × 4## `boys gestational age` `boys weight` `girls gestational age` `girls weight`## &lt;dbl&gt; &lt;dbl&gt; &lt;dbl&gt; &lt;dbl&gt;## 1 40 2968 40 3317## 2 38 2795 36 2729## 3 40 3163 40 2935## 4 35 2925 38 2754## 5 36 2625 42 3210## 6 37 2847 39 2817## 7 41 3292 40 3126## 8 40 3473 37 2539## 9 37 2628 36 2412## 10 38 3176 38 2991## 11 40 3421 39 2875## 12 38 2975 40 3231dim(birthweight)## [1] 12 4library(tidyverse)birthweight |&gt; ggplot(aes(x=`boys gestational age`,y=`boys weight`)) + geom_point(shape=1, size=3) + geom_point(aes(x=`girls gestational age`, y=`girls weight`), shape=19, size=3) +theme_bw() + theme(# Hide panel borders and remove grid lines#panel.</description>
</item>
<item>
<title>Common statistical tests are linear models</title>
<link>/2023/09/24/common-statistical-tests-are-linear-models-or-how-to-teach-stats/</link>
<pubDate>Sun, 24 Sep 2023 00:00:00 +0000</pubDate>
<guid>/2023/09/24/common-statistical-tests-are-linear-models-or-how-to-teach-stats/</guid>
<description>Interpretation of R’s lm() outputFive point summaryCoefficients and \(\hat{\beta_i}s\)\(t\)-statisticsResidual standard errorAdjusted \(R^2\)\(F\)-statisticThe simplicity underlying common testsSettings and toy dataPearson and Spearman correlationTheory: As linear modelsTheory: rank-transformationR code: Pearson correlationR code: Spearman correlationOne meanOne sample t-test and Wilcoxon signed-rankTheory: As linear modelsR code: One-sample t-testR code: Wilcoxon signed-rank testPaired samples t-test and Wilcoxon matched pairsTheory: As linear modelsR code: Paired sample t-testR code: Wilcoxon matched pairsTwo meansIndependent t-test and Mann-Whitney UTheory: As linear modelsTheory: Dummy codingTheory: Dummy coding (continued)R code: independent t-testR code: Mann-Whitney UWelch’s t-testThree or more meansOne-way ANOVA and Kruskal-WallisTheory: As linear modelsExample dataR code: one-way ANOVAR code: Kruskal-WallisTwo-way ANOVATheory: As linear modelsR code: Two-way ANOVAANCOVAProportions: Chi-square is a log-linear modelGoodness of fitTheory: As log-linear modelExample dataR code: Goodness of fitContingency tablesTheory: As log-linear modelExample dataR code: Chi-square testSources and further equivalencesExplicit GLM(M) Equivalents for Standard TestsExplicit GLM Test: PoissonBinomial Test: Logistic RegressionClassical Test: Exact Binomial TestExplicit GLM: Logit (or Probit)Probability density function of Logistic distributionProportion Test: (Multinomial) Logistic or Poisson ModelClassical Test: Test for Equality of ProportionsExplicit GLM: LogitExplicit GLM: PoissonClassical Test: Poisson TestReferences#share-buttons img {width: 40px;padding-right: 15px;border: 0;box-shadow: 0;display: inline;vertical-align: top;}# Options for building this documentknitr::opts_chunk$set(fig.</description>
</item>
<item>
<title>Introducing Monte Carlo Methods with R</title>
<link>/2023/08/28/introducing-monte-carlo-methods-with-r/</link>
<pubDate>Mon, 28 Aug 2023 00:00:00 +0000</pubDate>
<guid>/2023/08/28/introducing-monte-carlo-methods-with-r/</guid>
<description>1. Basic R ProgrammingBivariate Normal distributionsthe t value of Pearson correlationKolmogorov-Smirnov Goodness-of-Fit TestShapiro–Wilk normality testWilcoxon signed rank testNewton’s method for calculating the square root1. Basic R Programminglibrary(MASS)e &lt;- c(1:5)d &lt;- c(6:10)e*d## [1] 6 14 24 36 50t(e)*d## [,1] [,2] [,3] [,4] [,5]## [1,] 6 14 24 36 50sum(t(e)*d)## [1] 130t(e)%*%d## [,1]## [1,] 130d%*%t(e)## [,1] [,2] [,3] [,4] [,5]## [1,] 6 12 18 24 30## [2,] 7 14 21 28 35## [3,] 8 16 24 32 40## [4,] 9 18 27 36 45## [5,] 10 20 30 40 50x1=matrix(1:20,nrow=5) #build the numeric matrix x1 of dimension#5 4 with rst row 1, 6, 11, 16x2=matrix(1:20,nrow=5,byrow=T) #build the numeric matrix x2 of dimension#5 4 with rst row 1, 2, 3, 4x3=t(x2) #transpose the matrix x2b=x3%*%x2b## [,1] [,2] [,3] [,4]## [1,] 565 610 655 700## [2,] 610 660 710 760## [3,] 655 710 765 820## [4,] 700 760 820 880sum(b)## [1] 11380c=x2%*%x3c## [,1] [,2] [,3] [,4] [,5]## [1,] 30 70 110 150 190## [2,] 70 174 278 382 486## [3,] 110 278 446 614 782## [4,] 150 382 614 846 1078## [5,] 190 486 782 1078 1374sum(c)## [1] 11150m &lt;- runif(16)m2 &lt;- matrix(m, nrow=4)m2## [,1] [,2] [,3] [,4]## [1,] 0.</description>
</item>
<item>
<title>Modern Statistics for Modern Biology-2</title>
<link>/2023/04/29/modern-statistics-for-modern-biology-2/</link>
<pubDate>Sat, 29 Apr 2023 00:00:00 +0000</pubDate>
<guid>/2023/04/29/modern-statistics-for-modern-biology-2/</guid>
<description>5 Clustering6 Testing7 Multivariate Analysis8 High-Throughput Count Data9 Multivariate methods for heterogeneous data5 Clustering## -----------------------------------------------------------------------------library(&quot;MASS&quot;)library(&quot;RColorBrewer&quot;)set.seed(101)n &lt;- 60000S1=matrix(c(1,.72,.72,1), ncol=2)S2=matrix(c(1.5,-0.6,-0.6,1.5),ncol=2)mu1=c(.5,2.5)mu2=c(6.5,4)X1 = mvrnorm(n, mu=c(.5,2.5), Sigma=matrix(c(1,.72,.72,1), ncol=2))X2 = mvrnorm(n,mu=c(6.5,4), Sigma=matrix(c(1.5,-0.6,-0.6,1.5),ncol=2))# A color palette from blue to yellow to redk = 11my.cols &lt;- rev(brewer.pal(k, &quot;RdYlBu&quot;))plot(X1, xlim=c(-4,12),ylim=c(-2,9), xlab=&quot;Orange&quot;, ylab=&quot;Red&quot;, pch=&#39;.</description>
</item>
<item>
<title>Modern Statistics for Modern Biology-3</title>
<link>/2023/04/29/modern-statistics-for-modern-biology-3/</link>
<pubDate>Sat, 29 Apr 2023 00:00:00 +0000</pubDate>
<guid>/2023/04/29/modern-statistics-for-modern-biology-3/</guid>
<description>10 Networks and Trees11 Image data12 Supervised Learning13 Design of High Throughput Experiments and their Analyses10 Networks and Trees## -----------------------------------------------------------------------------dats = read.table(&quot;../data/small_chemokine.txt&quot;, header = TRUE)library(&quot;ggtree&quot;)## ggtree v3.4.0 For help: https://yulab-smu.top/treedata-book/## ## If you use the ggtree package suite in published research, please cite## the appropriate paper(s):## ## Guangchuang Yu, David Smith, Huachen Zhu, Yi Guan, Tommy Tsan-Yuk Lam.</description>
</item>
<item>
<title>Modern Statistics for Modern Biology-1</title>
<link>/2023/04/28/modern-statistics-for-modern-biology-1/</link>
<pubDate>Fri, 28 Apr 2023 00:00:00 +0000</pubDate>
<guid>/2023/04/28/modern-statistics-for-modern-biology-1/</guid>
<description>1 Generative Models for Discrete Data2 Statistical Modeling3 High Quality Graphics in R4 Mixture Models1 Generative Models for Discrete Data## -----------------------------------------------------------------------------dpois(x = 3, lambda = 5)## [1] 0.1403739## -----------------------------------------------------------------------------.oldopt = options(digits = 2)0:12## [1] 0 1 2 3 4 5 6 7 8 9 10 11 12dpois(x = 0:12, lambda = 5)## [1] 0.</description>
</item>
<item>
<title>Neural Networks and Deep Learning</title>
<link>/2022/01/16/neural-networks-and-deep-learning/</link>
<pubDate>Sun, 16 Jan 2022 00:00:00 +0000</pubDate>
<guid>/2022/01/16/neural-networks-and-deep-learning/</guid>
<description>1. Using neural nets to recognize handwritten digits2. How the backpropagation algorithm works3. Improving the way neural networks learn3.1 The sigmoid output and cross-entropy cost function3.2 Overfitting and regularization3.3 Weight initialization3.5 How to choose a neural network’s hyper-parameters?4. A visual proof that neural nets can compute any function5. Why are deep neural networks hard to train?6. Deep learning6.</description>
</item>
<item>
<title>Longitudinal Analysis</title>
<link>/2021/10/27/longitudinal-analysis/</link>
<pubDate>Wed, 27 Oct 2021 00:00:00 +0000</pubDate>
<guid>/2021/10/27/longitudinal-analysis/</guid>
<description>3. Overview of Linear Models for Longitudinal DataReferences3. Overview of Linear Models for Longitudinal DataReferences1. Fitzmaurice GM, Laird NM, Ware JH. Applied longitudinal analysis. John Wiley &amp; Sons; 2012.2. Singer JD, Willett JB, Willett JB, et al. Applied longitudinal data analysis: Modeling change and event occurrence. Oxford university press; 2003.</description>
</item>
<item>
<title>Stochastic Processes</title>
<link>/2021/10/25/stochastic-processes/</link>
<pubDate>Mon, 25 Oct 2021 00:00:00 +0000</pubDate>
<guid>/2021/10/25/stochastic-processes/</guid>
<description>1. Measure and IntegrationMeasurable SpacesMeasurable FunctionsMeasuresIntegrationTransforms and Indefinite IntegralsKernels and Product Spaces2. Probability SpacesProbability Spaces and RandomExpectationsLp-spaces and Uniform IntegrabilityInformation and DeterminabilityIndependence3. Convergence4. Conditioning5. Martingales and Stochastics6. Poisson Random Measures7. L´evy Processes8. Brownian Motion9. Markov ProcessesReferences1. Measure and IntegrationMeasurable SpacesMonotone class theoremLet \(\mathcal C\) be a class of subset closed under finite intersections and containing \(\Omega\) (that is, \(\mathcal C\) is a \(\pi\)-system).</description>
</item>
<item>
<title>Probability</title>
<link>/2021/10/10/probability/</link>
<pubDate>Sun, 10 Oct 2021 00:00:00 +0000</pubDate>
<guid>/2021/10/10/probability/</guid>
<description>ReferencesReferences1. Blitzstein JK, Hwang J. Introduction to probability. Chapman; Hall/CRC; 2019.</description>
</item>
<item>
<title>Statistical Inference</title>
<link>/2021/09/05/statistical-inference/</link>
<pubDate>Sun, 05 Sep 2021 00:00:00 +0000</pubDate>
<guid>/2021/09/05/statistical-inference/</guid>
<description>1. Probability TheorySet TheoryBasics of Probability TheoryConditional Probability and IndependenceRandom VariablesDistribution FunctionsDensity and Mass Functions2. Transformations and ExpectationsDistributions of Functions of a Random Variable Theorem3. Common Families of DistributionsContinuous DistributionsGamma DistributionNormal DistributionChi-Squared DistributionStudent’s \(t\)-DistributionSnedecor’s \(F\)-DistributionMultinomial DistributionExponential FamiliesLocation and Scale FamiliesInequalities and Identities4.</description>
</item>
<item>
<title>Generalized Linear Models</title>
<link>/2021/08/10/generalized-linear-models/</link>
<pubDate>Tue, 10 Aug 2021 00:00:00 +0000</pubDate>
<guid>/2021/08/10/generalized-linear-models/</guid>
<description>1. Introduction to Linear and Generalized Linear ModelsReferences1. Introduction to Linear and Generalized Linear ModelsReferences1. Neter J, Kutner MH, Nachtsheim CJ, Wasserman W, others. Applied linear statistical models. 1996.2. Agresti A. Foundations of linear and generalized linear models. John Wiley &amp; Sons; 2015.</description>
</item>
<item>
<title>Hands-on Machine Learning: Keras-TensorFlow</title>
<link>/2021/06/21/hands-on-machine-learning-keras/</link>
<pubDate>Mon, 21 Jun 2021 00:00:00 +0000</pubDate>
<guid>/2021/06/21/hands-on-machine-learning-keras/</guid>
<description>Chapter 10 – Introduction to Artificial Neural Networks with KerasPerceptronsThe Multilayer Perceptron (MLP) and BackpropagationActivation functionsRegression MLPClassification MLPsImplementing MLPs with KerasBuilding an Image Classifier Using the Sequential APIBuilding Complex Models Using the Functional APIUsing the Subclassing API to Build Dynamic ModelsSaving and Restoring a ModelUsing Callbacks during TrainingUsing TensorBoard for VisualizationFine-Tuning Neural Network HyperparametersExercise solutionsChapter 11 – Training Deep Neural NetworksVanishing/Exploding Gradients ProblemGlorot and He InitializationNonsaturating Activation FunctionsBatch NormalizationImplement batch normalization with kerasGradient ClippingReusing Pretrained LayersTransfer Learning with KerasFaster OptimizersMomentum optimizationNesterov Accelerated GradientAdaGradRMSPropAdam OptimizationAdamax OptimizationNadam OptimizationLearning Rate SchedulingPower SchedulingExponential SchedulingPiecewise Constant SchedulingPerformance Schedulingtf.</description>
</item>
<item>
<title>Hands-on Machine Learning: Scikit-Learn</title>
<link>/2021/06/06/hands-on-machine-learning/</link>
<pubDate>Sun, 06 Jun 2021 00:00:00 +0000</pubDate>
<guid>/2021/06/06/hands-on-machine-learning/</guid>
<description>Chapter 1 – The Machine Learning landscapeExample 1-1. Training and running a linear model using Scikit-LearnExercisesChapter 2 – End-to-end Machine Learning projectWorking with Real DataLook at the Big PictureDiscover and visualize the data to gain insightsPrepare the data for Machine Learning algorithmsSelect and train a modelFine-Tune Your ModelExtra materialA full pipeline with both preparation and predictionModel persistence using joblibExample SciPy distributions for RandomizedSearchCVExercise solutions1.</description>
</item>
<item>
<title>Introduction to Algorithms: Foundations</title>
<link>/2021/06/04/introduction-to-algorithms/</link>
<pubDate>Fri, 04 Jun 2021 00:00:00 +0000</pubDate>
<guid>/2021/06/04/introduction-to-algorithms/</guid>
<description>1 The Role of Algorithms in Computing1.1 AlgorithmsExercises1.1-11.1-21.1-31.1-41.1-51.2 Algorithms as a technologyExercises1.2-11.2-21.2-3Problems1-1 Comparison of running timesReferences1 The Role of Algorithms in ComputingWhat are algorithms? Why is the study of algorithms worthwhile? What is the role of algorithms relative to other technologies used in computers?</description>
</item>
<item>
<title>Structure and Interpretation of Computer Programs (SICP)</title>
<link>/2021/06/03/structure-and-interpretation-of-computer-programs-sicp/</link>
<pubDate>Thu, 03 Jun 2021 00:00:00 +0000</pubDate>
<guid>/2021/06/03/structure-and-interpretation-of-computer-programs-sicp/</guid>
<description>ReferencesReferenceshttps://github.com/rmculpepper/iracket
1. Abelson H, Sussman GJ. Structure and interpretation of computer programs. The MIT Press; 1996.</description>
</item>
<item>
<title>Bayesian Thinking</title>
<link>/2021/05/25/bayesian-thinking/</link>
<pubDate>Tue, 25 May 2021 00:00:00 +0000</pubDate>
<guid>/2021/05/25/bayesian-thinking/</guid>
<description>1 The Basics of Bayesian StatisticsBayes’ RuleConditional Probabilities &amp; Bayes’ RuleBayes’ Rule and Diagnostic TestingBayes UpdatingBayesian vs. Frequentist Definitions of ProbabilityInference for a Proportion: Frequentist ApproachInference for a Proportion: Bayesian ApproachEffect of Sample Size on the PosteriorFrequentist vs. Bayesian Inference2 Bayesian InferenceContinuous Variables and Eliciting Probability DistributionsFrom the Discrete to the ContinuousElicitationConjugacyInference on a Binomial ProportionThe Gamma-Poisson Conjugate FamiliesThe Normal-Normal Conjugate FamiliesNon-Conjugate PriorsCredible IntervalsPredictive Inference3 Losses and Decision MakingBayesian Decision MakingPosterior Probabilities of Hypotheses and Bayes Factors4 Inference and Decision-Making with Multiple ParametersConjugate Prior for \(\mu\) and \(\sigma^2\)Conjugate Posterior DistributionMarginal Distribution for \(\mu\): Student \(t\)Credible Intervals for \(\mu\)Example: TTHM in TapwaterSection Summary(Optional) DerivationsMonte Carlo InferenceMonte Carlo SamplingMonte Carlo Inference: Tap Water ExampleMonte Carlo Inference for Functions of ParametersSummaryPrior Predictive DistributionTap Water Example (continued)Sampling from the Prior Predictive in RPosterior PredictiveSummaryMarkov Chain Monte Carlo (MCMC)5 Hypothesis Testing with Normal PopulationsBayes Factors for Testing a Normal Mean: variance knownComparing Two Paired Means using Bayes FactorsComparing Independent Means: Hypothesis TestingInference after Testing6 Introduction to Bayesian RegressionBayesian Simple Linear RegressionFrequentist Ordinary Least Square (OLS) Simple Linear RegressionBayesian Simple Linear Regression Using the Reference PriorInformative Priors(Optional) Derivations of Marginal Posterior Distributions of \(\alpha\), \(\beta\), \(\sigma^2\)Marginal Posterior Distribution of \(\beta\)Marginal Posterior Distribution of \(\alpha\)Marginal Posterior Distribution of \(\sigma^2\)Joint Normal-Gamma Posterior DistributionsPosterior Distribution of \(\epsilon_j\) Conditioning On \(\sigma^2\)Implementation Using BAS PackageBayesian Multiple Linear RegressionThe ModelData Pre-processingSpecify Bayesian Prior DistributionsFitting the Bayesian ModelPosterior Means and Posterior Standard DeviationsCredible Intervals SummarySummary7 Bayesian Model ChoiceDefinition of BICBackward Elimination with BICCoefficient Estimates Under Reference Prior for Best BIC ModelOther CriteriaModel UncertaintyCalculating Posterior Probability in RBayesian Model AveragingVisualizing Model UncertaintyBayesian Model Averaging Using Posterior ProbabilityCoefficient Summary under BMASummary8 Stochastic Explorations Using MCMCMarkov Chain Monte Carlo ExplorationOther Priors for Bayesian Model UncertaintyZellner’s \(g\)-PriorBayes Factor of Zellner’s \(g\)-PriorKid’s Cognitive Score ExampleThe UScrime Data Set and Data ProcessingBayesian Models and DiagnosticsPosterior Uncertainty in CoefficientsPredictionModel ChoicePrediction with New DataSummaryReferences1 The Basics of Bayesian StatisticsBayesian statistics mostly involves conditional probability, which is the the probability of an event A given event B, and it can be calculated using the Bayes rule.</description>
</item>
<item>
<title>AOS chapter25 Simulation Methods</title>
<link>/2021/05/24/aos-chapter25-simulation-methods/</link>
<pubDate>Mon, 24 May 2021 00:00:00 +0000</pubDate>
<guid>/2021/05/24/aos-chapter25-simulation-methods/</guid>
<description>25. Simulation Methods25.1 Bayesian Inference Revisited25.2 Basic Monte Carlo Integration25.3 Importance Sampling25.4 MCMC Part I: The Metropolis-Hastings Algorithm25.5 MCMC Part II: Different Flavors25.7 ExercisesReferences25. Simulation MethodsIn this chapter we will see that by generating data in a clever way, we can solve a number of problems such as integrating or maximizing a complicated function. For integration, we will study 3 methods:</description>
</item>
<item>
<title>AOS chapter24 Stochastic Processes</title>
<link>/2021/05/22/aos-chapter24-stochastic-processes/</link>
<pubDate>Sat, 22 May 2021 00:00:00 +0000</pubDate>
<guid>/2021/05/22/aos-chapter24-stochastic-processes/</guid>
<description>24. Stochastic Processes24.1 Introduction24.2 Markov Chains24.3 Poisson Process24.6 ExercisesReferences24. Stochastic Processes24.1 IntroductionA stochastic process \(\{ X_t : t \in T \}\) is a collection of random variables. We shall sometimes write \(X(t)\) instead of \(X_t\). The variables \(X_t\) take values in some set \(\mathcal{X}\) called the state space. The set \(T\) is called the index set and for our purposes can be thought of as time.</description>
</item>
<item>
<title>AOS chapter23 Classification</title>
<link>/2021/05/20/aos-chapter23-classification/</link>
<pubDate>Thu, 20 May 2021 00:00:00 +0000</pubDate>
<guid>/2021/05/20/aos-chapter23-classification/</guid>
<description>23. Classification23.1 Introduction23.2 Error Rates and The Bayes Classifier23.3 Gaussian and Linear Classifiers23.4 Linear Regression and Logistic Regression23.5 Relationship Between Logistic Regression and LDA23.6 Density Estimation and Naive Bayes23.7 Trees23.8 Assessing Error Rates and Choosing a Good Classifier23.9 Support Vector Machines23.10 Kernelization23.11 Other Classifiers23.13 ExercisesReferences23. Classification23.1 IntroductionThe problem of predicting a discrete variable \(Y\) from another random variable \(X\) is called classification, supervised learning, discrimination or pattern recognition.</description>
</item>
<item>
<title>AOS chapter22 Smoothing Using Orthogonal Functions</title>
<link>/2021/05/17/aos-chapter22-smoothing-using-orthogonal-functions/</link>
<pubDate>Mon, 17 May 2021 00:00:00 +0000</pubDate>
<guid>/2021/05/17/aos-chapter22-smoothing-using-orthogonal-functions/</guid>
<description>22. Smoothing Using Orthogonal Functions22.1 Orthogonal Functions and \(L_2\) Spaces22.2 Density Estimation22.3 Regression22.4 Wavelets22.6 ExercisesReferences22. Smoothing Using Orthogonal FunctionsIn this Chapter we study a different approach to nonparametric curve estimation based on orthogonal functions. We begin with a brief introduction to the theory of orthogonal functions. Then we turn to density estimation and regression.
22.1 Orthogonal Functions and \(L_2\) SpacesLet \(v = (v_1, v_2, v_3)\) denote a three dimensional vector.</description>
</item>
<item>
<title>AOS chapter21 Nonparametric Curve Estimation</title>
<link>/2021/05/09/aos-chapter21-nonparametric-curve-estimation/</link>
<pubDate>Sun, 09 May 2021 00:00:00 +0000</pubDate>
<guid>/2021/05/09/aos-chapter21-nonparametric-curve-estimation/</guid>
<description>21. Nonparametric Curve Estimation21.1 The Bias-Variance Tradeoff21.2 Histograms21.3 Kernel Density Estimation21.4 Nonparametric Regression21.5 Appendix: Confidence Sets and Bias21.7 ExercisesReferences21. Nonparametric Curve EstimationIn this Chapter we discuss the nonparametric estimation of probability density functions and regression functions, which we refer to as curve estimation.
In Chapter 8 we saw it is possible to consistently estimate a cumulative distribution function \(F\) without making any assumptions about \(F\).</description>
</item>
<item>
<title>Density Estimation</title>
<link>/2021/05/05/density-estimation/</link>
<pubDate>Wed, 05 May 2021 00:00:00 +0000</pubDate>
<guid>/2021/05/05/density-estimation/</guid>
<description>1. INTRODUCTION2. SURVEY OF EXISTING METHODS2.1 Introduction2.2. Histograms2.3. The naive estimator2.4. The kernel estimator2.5. The nearest neighbour method2.6. The variable kernel method2.7. Orthogonal series estimators2.8. Maximum penalized likelihood estimators2.9. General weight function estimators1. INTRODUCTIONReferences1. INTRODUCTIONThe probability density function is a fundamental concept in statistics. Consider any random quantity \(X\) that has probability density function \(f\).</description>
</item>
<item>
<title>AOS chapter20 Directed Graphs</title>
<link>/2021/05/03/aos-chapter20-directed-graphs/</link>
<pubDate>Mon, 03 May 2021 00:00:00 +0000</pubDate>
<guid>/2021/05/03/aos-chapter20-directed-graphs/</guid>
<description>20. Directed Graphs20.1 Introduction20.2 DAG’s20.3 Probability and DAG’s20.4 More Independence Relations20.5 Estimation for DAG’s20.6 Causation Revisited20.8 Exercises20. Directed Graphs20.1 IntroductionDirected graphs are similar to undirected graphs, but there are arrows between vertices instead of edges. Like undirected graphs, directed graphs can be used to represent independence relations. They can also be used as an alternative to counterfactuals to represent causal relationships.</description>
</item>
<item>
<title>AOS chapter19 Causal Inference</title>
<link>/2021/05/01/aos-chapter19-causal-inference/</link>
<pubDate>Sat, 01 May 2021 00:00:00 +0000</pubDate>
<guid>/2021/05/01/aos-chapter19-causal-inference/</guid>
<description>19. Causal Inference19.1 The Counterfactual Model19.2 Beyond Binary Treatments19.3 Observational Studies and Confounding19.4 Simpson’s Paradox19.6 ExercisesReferences19. Causal InferenceIn this chapter we discuss causation. Roughly speaking “\(X\) causes \(Y\)” means that changing the value of \(X\) will change the distribution of \(Y\). When \(X\) causes \(Y\), \(X\) and \(Y\) will be associated but the reverse is not, in general, true.</description>
</item>
<item>
<title>AOS chapter18 Loglinear Models</title>
<link>/2021/04/30/aos-chapter18-loglinear-models/</link>
<pubDate>Fri, 30 Apr 2021 00:00:00 +0000</pubDate>
<guid>/2021/04/30/aos-chapter18-loglinear-models/</guid>
<description>18. Loglinear Models18.1 The Loglinear Model18.2 Graphical Log-Linear Models18.3 Hierarchical Log-Linear Models18.4 Model Generators18.5 Lattices18.6 Fitting Log-Linear Models to Data18.8 ExercisesReferences18. Loglinear Models18.1 The Loglinear ModelLet \(X = (X_1, \dots, X_m)\) be a random vector with probability
\[ f(x) = \mathbb{P}(X = x) = \mathbb{P}(X_1 = x_1, \dots, X_m = x_m) \]</description>
</item>
<item>
<title>AOS chapter17 Undirected Graphs and Conditional Independence</title>
<link>/2021/04/26/aos-chapter16-undirected-graphs-and-conditional-independence/</link>
<pubDate>Mon, 26 Apr 2021 00:00:00 +0000</pubDate>
<guid>/2021/04/26/aos-chapter16-undirected-graphs-and-conditional-independence/</guid>
<description>17. Undirected Graphs and Conditional Independence17.1 Conditional Independence17.2 Undirected Graphs17.3 Probability and Graphs17.4 Fitting Graphs to Data17.6 ExercisesReferences17. Undirected Graphs and Conditional Independence\(k\) binary variables \(Y_1, \dots, Y_k\) correspond to a multinomial with \(N = 2^k\) categories. Even for moderately large \(k\), \(2^k\) will be huge. It can be shown in this case that the MLE is a poor estimator, because the data are sparse.</description>
</item>
<item>
<title>AOS chapter16 Inference about Independence</title>
<link>/2021/04/25/aos-chapter16-inference-about-independence/</link>
<pubDate>Sun, 25 Apr 2021 00:00:00 +0000</pubDate>
<guid>/2021/04/25/aos-chapter16-inference-about-independence/</guid>
<description>16. Inference about Independence16.1 Two Binary Variables16.2 Interpreting the Odds Ratios16.3 Two Discrete Variables16.4 Two Continuous Variables16.5 One Continuous Variable and One Discrete16.7 ExercisesReferences16. Inference about IndependenceThis chapter addresses two questions:
How do we test if two random variables are independent? How do we estimate the strength of dependence between two random variables? Recall we write \(Y \text{ ⫫ } Z\) to mean that \(Y\) and \(Z\) are independent.</description>
</item>
<item>
<title>AOS chapter15 Multivariate Models</title>
<link>/2021/04/24/aos-chapter15-multivariate-models/</link>
<pubDate>Sat, 24 Apr 2021 00:00:00 +0000</pubDate>
<guid>/2021/04/24/aos-chapter15-multivariate-models/</guid>
<description>15. Multivariate Models15.1 Random Vectors15.2 Estimating the Correlation15.3 Multinomial15.4 Multivariate Normal15.5 Appendix15.6 Exercisesbox_mullerReferences15. Multivariate ModelsReview of notation from linear algebra:
If \(x\) and \(y\) are vectors, then \(x^T y = \sum_j x_j y_j\).If \(A\) is a matrix then \(\text{det}(A)\) denotes the determinant of \(A\), \(A^T\) denotes the transpose of A, and \(A^{-1}\) denotes the inverse of \(A\) (if the inverse exists).</description>
</item>
<item>
<title>AOS chapter14 Linear Regression</title>
<link>/2021/04/23/aos-chapter14-linear-regression/</link>
<pubDate>Fri, 23 Apr 2021 00:00:00 +0000</pubDate>
<guid>/2021/04/23/aos-chapter14-linear-regression/</guid>
<description>14. Linear Regression14.1 Simple Linear Regression14.2 Least Squares and Maximum Likelihood14.3 Properties of the Least Squares Estimators14.4 Prediction14.5 Multiple Regression14.6 Model Selection14.7 The Lasso14.8 Technical Appendix14.9 ExercisesReferences14. Linear RegressionRegression is a method for studying the relationship between a response variable \(Y\) and a covariate \(X\). The covariate is also called a predictor variable or feature.</description>
</item>
<item>
<title>AOS chapter13 Statistical Decision Theory</title>
<link>/2021/04/22/aos-chapter13-statistical-decision-theory/</link>
<pubDate>Thu, 22 Apr 2021 00:00:00 +0000</pubDate>
<guid>/2021/04/22/aos-chapter13-statistical-decision-theory/</guid>
<description>13. Statistical Decision Theory13.1 Preliminaries13.2 Comparing Risk Functions13.3 Bayes Estimators13.4 Minimax Rules13.5 Maximum Likelihood, Minimax and Bayes13.6 Admissibility13.7 Stein’s Paradox13.9 ExercisesReferences13. Statistical Decision Theory13.1 PreliminariesDecision theory is a formal theory for comparing statistical procedures.
In the language of decision theory, an estimator is sometimes called a decision rule and the possible values of the decision rule are called actions.</description>
</item>
<item>
<title>AOS chapter12 Bayesian Inference</title>
<link>/2021/04/21/aos-chapter12-bayesian-inference/</link>
<pubDate>Wed, 21 Apr 2021 00:00:00 +0000</pubDate>
<guid>/2021/04/21/aos-chapter12-bayesian-inference/</guid>
<description>12. Bayesian Inference12.1 Bayesian Philosophy12.2 The Bayesian Method12.3 Functions of Parameters12.4 Simulation12.5 Large Sample Properties for Bayes’ Procedures12.6 Flat Priors, Improper Priors and “Noninformative” Priors12.7 Multiparameter Problems12.8 Strengths and Weaknesses of Bayesian Inference12.9 Appendix12.11 ExercisesReferences12. Bayesian Inference12.1 Bayesian PhilosophyPostulates of frequentist (or classical) inference:
Probability refers to limiting relative frequencies.</description>
</item>
<item>
<title>AOS Chapter11 Hypothesis Testing and p-values</title>
<link>/2021/04/20/aos-chapter11-hypothesis-testing-and-p-values/</link>
<pubDate>Tue, 20 Apr 2021 00:00:00 +0000</pubDate>
<guid>/2021/04/20/aos-chapter11-hypothesis-testing-and-p-values/</guid>
<description>11. Hypothesis Testing and p-values11.1 The Wald Test11.2 p-values11.3 The \(\chi^2\) distribution11.4 Pearson’s \(\chi^2\) Test for Multinomial Data11.5 The Permutation Test11.6 Multiple Testing11.7 Technical Appendix11.9 ExercisesReferences11. Hypothesis Testing and p-valuesSuppose we partition the parameters space \(\Theta\) into two disjoint sets \(\Theta_0\) and \(\Theta_1\) and we wish to test
\[H_0: \theta \in \Theta_0\quad \text{versus} \quadH_1: \theta \in \Theta_1\]</description>
</item>
<item>
<title>AOS chapter10 Parametric Inference</title>
<link>/2021/04/18/aos-chapter10-parametric-inference/</link>
<pubDate>Sun, 18 Apr 2021 00:00:00 +0000</pubDate>
<guid>/2021/04/18/aos-chapter10-parametric-inference/</guid>
<description>10. Parametric Inference10.1 Parameter of interest10.2 The Method of Moments10.3 Maximum Likelihood10.4 Properties of Maximum Likelihood Estimators10.5 Consistency of Maximum Likelihood Estimator10.6 Equivalence of the MLE10.7 Asymptotic Normality10.8 Optimality10.9 The Delta Method10.10 Multiparameter Models10.11 The Parametric Bootstrap10.12 Technical Appendix10.13 ExercisesReferences10. Parametric InferenceParametric models are of the form</description>
</item>
<item>
<title>AOS Chapter09 The Bootstrap</title>
<link>/2021/04/17/aos-chapter09-the-bootstrap/</link>
<pubDate>Sat, 17 Apr 2021 00:00:00 +0000</pubDate>
<guid>/2021/04/17/aos-chapter09-the-bootstrap/</guid>
<description>9. The Bootstrap9.1 Simulation9.2 Bootstrap Variance Estimation9.3 Bootstrap Confidence Intervals9.5 Technical Appendix9.6 ExercisesReferences9. The BootstrapLet \(X_1, \dots, X_n \sim F\) be random variables distributed according to \(F\), and
\[ T_n = g(X_1, \dots, X_n)\]
be a statistic, that is, any function of the data. Suppose we want to know \(\mathbb{V}_F(T_n)\), the variance of \(T_n\).
For example, if \(T_n = n^{-1}\sum_{i=1}^nX_i\) then \(\mathbb{V}_F(T_n) = \sigma^2/n\) where \(\sigma^2 = \int (x - \mu)^2dF(x)\) and \(\mu = \int x dF(x)\).</description>
</item>
<item>
<title>AOS Chapter08 Estimating the CDF and Statistical Functionals</title>
<link>/2021/04/16/aos-chapter08-estimating-the-cdf-and-statistical-functionals/</link>
<pubDate>Fri, 16 Apr 2021 00:00:00 +0000</pubDate>
<guid>/2021/04/16/aos-chapter08-estimating-the-cdf-and-statistical-functionals/</guid>
<description>8. Estimating the CDF and Statistical Functionals8.1 Empirical distribution function8.2 Statistical Functionals8.3 Technical Appendix8.5 ExercisesReferences8. Estimating the CDF and Statistical Functionals8.1 Empirical distribution functionThe empirical distribution function \(\hat{F_n}\) is the CDF that puts mass \(1/n\) at each data point \(X_i\). Formally,
\[\begin{align}\hat{F_n}(x) &amp; = \frac{\sum_{i=1}^n I\left(X_i \leq x \right)}{n} \\&amp;= \frac{\text{#}|\text{observations less than or equal to x}|}{n}\end{align}\]</description>
</item>
<item>
<title>AOS Chapter07 Models, Statistical Inference and Learning</title>
<link>/2021/04/15/aos-chapter07-models-statistical-inference-and-learning/</link>
<pubDate>Thu, 15 Apr 2021 00:00:00 +0000</pubDate>
<guid>/2021/04/15/aos-chapter07-models-statistical-inference-and-learning/</guid>
<description>7. Models, Statistical Inference and Learning7.2 Parametric and Nonparametric Models7.3 Fundamental Concepts in Inference7.5 Technical AppendixReferences7. Models, Statistical Inference and Learning7.2 Parametric and Nonparametric ModelsA statistical model is a set of distributions \(\mathfrak{F}\).
A parametric model is a set \(\mathfrak{F}\) that may be parametrized by a finite number of parameters. For example, if we assume that data comes from a normal distribution then</description>
</item>
<item>
<title>AOS Chapter06 Convergence of Random Variables</title>
<link>/2021/04/14/aos-chapter05-convergence-of-random-variables/</link>
<pubDate>Wed, 14 Apr 2021 00:00:00 +0000</pubDate>
<guid>/2021/04/14/aos-chapter05-convergence-of-random-variables/</guid>
<description>6. Convergence of Random Variables6.2 Types of convergence6.3 The Law of Large Numbers6.4 The Central Limit Theorem6.5 The Delta Method6.6 Technical appendix6.8 ExercisesReferences6. Convergence of Random Variables6.2 Types of convergence\(X_n\) converges to \(X\) in probability, written \(X_n \xrightarrow{\text{P}} X\), if, for every \(\epsilon &gt; 0\):
\[ \mathbb{P}( |X_n - X| &gt; \epsilon ) \rightarrow 0 \]</description>
</item>
<item>
<title>AOS Chapter05 Inequalities</title>
<link>/2021/04/13/aos-chapter05-inequalities/</link>
<pubDate>Tue, 13 Apr 2021 00:00:00 +0000</pubDate>
<guid>/2021/04/13/aos-chapter05-inequalities/</guid>
<description>5. Inequalities5.1 Markov and Chebyshev Inequalities5.2 Hoeffding’s Inequality5.3 Cauchy-Schwartz and Jensen Inequalities5.4 Technical Appendix: Proof of Hoeffding’s Inequality5.6 ExercisesReferences5. Inequalities5.1 Markov and Chebyshev InequalitiesTheorem 5.1 (Markov’s Inequality). Let \(X\) be a non-negative random variable and suppose that \(\mathbb{E}(X)\) exists. For any \(t &gt; 0\),
\[ \mathbb{P}(X &gt; t) \leq \frac{\mathbb{E}(X)}{t} \]
Proof.
\[ \mathbb{E}(X)=\int_0^\infty xf(x) dx=\int_0^t xf(x) dx + \int_t^\infty xf(x) dx\geq \int_t^\infty xf(x) dx\geq t \int_t^\infty f(x) dx= t \mathbb{P}(X &gt; t)\]</description>
</item>
<item>
<title>AOS chapter04 Expectation, negative binomial distribution and gene counts, beta distribution and Order Statistics</title>
<link>/2021/04/12/aos-chapter04-expectation/</link>
<pubDate>Mon, 12 Apr 2021 00:00:00 +0000</pubDate>
<guid>/2021/04/12/aos-chapter04-expectation/</guid>
<description>4. Expectation4.1 Expectation of a Random Variable4.2 Properties of Expectations4.3 Variance and Covariance4.4 Expectation and Variance of Important Random Variables4.5 Conditional Expectation4.6 Technical Appendix4.7 Exercises4.8 Negative binomial (or gamma-Poisson) distribution and gene expression counts modeling4.9 Beta distribution and Order Statistics4.10 The conditional log-likelihood (CML) of NB distributionReferences4. Expectation4.1 Expectation of a Random VariableThe expected value, mean or first moment of \(X\) is defined to be</description>
</item>
<item>
<title>AOS chapter03 Random Variables</title>
<link>/2021/04/11/aos-chapter03-random-variables/</link>
<pubDate>Sun, 11 Apr 2021 00:00:00 +0000</pubDate>
<guid>/2021/04/11/aos-chapter03-random-variables/</guid>
<description>3. Random Variables3.1 Introduction3.2 Distribution Functions and Probability Functions3.3 Some Important Discrete Random Variables3.4 Some Important Continuous Random Variables3.5 Bivariate Distributions3.6 Marginal Distributions3.7 Independent Random Variables3.8 Conditional Distributions3.9 Multivariate Distributions and IID Samples3.10 Two Important Multivariate Distributions3.11 Transformations of Random Variables3.12 Transformation of Several Random Variables3.13 Technical Appendix3.14 ExercisesReferences3.</description>
</item>
<item>
<title>AOS chapter02 Probability</title>
<link>/2021/04/09/aos-chapter02-probability/</link>
<pubDate>Fri, 09 Apr 2021 00:00:00 +0000</pubDate>
<guid>/2021/04/09/aos-chapter02-probability/</guid>
<description>2. Probability2.2 Sample Spaces and Events2.3 Probability2.4 Probability on Finite Sample Spaces2.5 Independent Events2.6 Conditional Probability2.7 Bayes’ Theorem2.9 Technical Appendix2.10 ExercisesReferences2. Probability2.2 Sample Spaces and EventsThe sample space \(\Omega\) is the set of possible outcomes of an experiment. Points \(\omega\) in \(\Omega\) are called sample outcomes or realizations. Events are subsets of \(\Omega\).</description>
</item>
<item>
<title>ESL chapter 4 Linear Methods for Classification</title>
<link>/2021/04/03/esl-chapter-4-linear-methods-for-classification/</link>
<pubDate>Sat, 03 Apr 2021 00:00:00 +0000</pubDate>
<guid>/2021/04/03/esl-chapter-4-linear-methods-for-classification/</guid>
<description>Chapter 4. Linear Methods for Classification\(\S\) 4.1. IntroductionLinear regressionDiscriminant functionLogit transformationSeparating hyperplanesScope for generalization\(\S\) 4.2. Linear Regression of an Indicator MatrixRationaleA more simplistic viewpointMasked class with the regression approach\(\S\) 4.3. Linear Discriminant AnalysisLDA from multivariate GaussianEstimating parametersSimple correspondence between LDA and linear regression with two classesPractice beyond the Gaussian assumptionQuadratic Discriminant AnalysisWhy LDA and QDA have such a good track record?</description>
</item>
<item>
<title>ESL chapter 3 exercises</title>
<link>/2021/03/23/esl-chapter-3-exercises/</link>
<pubDate>Tue, 23 Mar 2021 00:00:00 +0000</pubDate>
<guid>/2021/03/23/esl-chapter-3-exercises/</guid>
<description>Ex. 3.9 (using the QR decomposition for fast forward-stepwise selection)Ex. 3.10 (using the z-scores for fast backwards stepwise regression)Ex. 3.11 (multivariate linear regression with different \(\Sigma_i\))Ex. 3.12 (ordinary least squares to implement ridge regression)Ex. 3.13 (principal component regression)Ex. 3.14 (when the inputs are orthogonal PLS stops after m = 1 step)Ex. 3.15 (PLS seeks directions that have high variance and high correlation)Relation to the optimization problemEx.</description>
</item>
<item>
<title>ESL chapter 3 Linear Methods for Regression</title>
<link>/2021/02/24/esl-chapter-3-linear-methods-for-regression/</link>
<pubDate>Wed, 24 Feb 2021 00:00:00 +0000</pubDate>
<guid>/2021/02/24/esl-chapter-3-linear-methods-for-regression/</guid>
<description>Chapter 3. Linear Methods for Regression\(\S\) 3.1. Introduction\(\S\) 3.2. Linear Regression Models and Least SquaresThe linear modelLeast squares fitSolution of least squaresGeometrical representation of the least squares estimateSampling properties of \(\hat{\beta}\)Inference and hypothesis testingConfidence intervals\(\S\) 3.2.1. Example: Prostate Cancer\(\S\) 3.2.2. The Gauss-Markov TheoremThe statement of the theoremImplications of the Gauss-Markov theoremRelation between prediction accuracy and MSE\(\S\) 3.</description>
</item>
<item>
<title>ESL chapter 2 Overview of Supervised Learning</title>
<link>/2021/02/12/esl-chapter-2-overview-of-supervised-learning/</link>
<pubDate>Fri, 12 Feb 2021 00:00:00 +0000</pubDate>
<guid>/2021/02/12/esl-chapter-2-overview-of-supervised-learning/</guid>
<description>\(\S\) Supervised Learning\(\S\) 2.3. Two Simple Approaches to Prediction: Least Squares and Nearest Neighbors\(\S\) 2.3.3 From Least Squares to Nearest Neighbors\(\S\) 2.3.1. Linear Models and Least SquaresLinear ModelsHow to fit the model: Least squaresLinear model in a classification contextWhere the data came from?\(\S\) 2.3.2 Nearest-Neighbor MethodsDo not satisfy with the training resultsEffective number of parametersDo not appreciate the training errors\(\S\) 2.</description>
</item>
<item>
<title>Single cell data analysis using scanpy</title>
<link>/2021/02/03/single-cell-data-analysis-using-scanpy/</link>
<pubDate>Wed, 03 Feb 2021 00:00:00 +0000</pubDate>
<guid>/2021/02/03/single-cell-data-analysis-using-scanpy/</guid>
<description>import numpy as npimport pandas as pdimport scanpy as sc# !mkdir data# !wget http://cf.10xgenomics.com/samples/cell-exp/1.1.0/pbmc3k/pbmc3k_filtered_gene_bc_matrices.tar.gz -O data/pbmc3k_filtered_gene_bc_matrices.tar.gz# !cd data; tar -xzf pbmc3k_filtered_gene_bc_matrices.tar.gz# !mkdir writesc.settings.verbosity = 3 # verbosity: errors (0), warnings (1), info (2), hints (3)sc.logging.print_header()sc.settings.set_figure_params(dpi=80, facecolor=&#39;white&#39;)scanpy==1.6.1 anndata==0.7.5 umap==0.4.6 numpy==1.19.2 scipy==1.5.2 pandas==1.2.1 scikit-learn==0.23.2 statsmodels==0.12.1 python-igraph==0.8.3 leidenalg==0.8.3results_file = &#39;write/pbmc3k.h5ad&#39; # the file that will store the analysis resultsadata = sc.</description>
</item>
<item>
<title>The C Programming Language</title>
<link>/2021/02/03/c/</link>
<pubDate>Wed, 03 Feb 2021 00:00:00 +0000</pubDate>
<guid>/2021/02/03/c/</guid>
<description>CHAPTER 1: A Tutorial Introduction1.1 Getting StartedExercise 1-1 .Exercise 1-2 .1.2 Variables and Arithmetic ExpressionsExercise 1-3 .Exercise 1-4 .1.3 The For StatementExercise 1-5 .1.4 Symbolic Constants1.5 Character Input and Output1.5.1 File CopyingExercise 1-6 .Exercise 1-7 .1.5.2 Character Counting1.5.3 Line CountingExercise 1-8 . Write a program to count blanks, tabs, and newlines.</description>
</item>
<item>
<title>Single cell RNA-seq data analysis using Markov Affinity-Based Graph Imputation</title>
<link>/2021/01/28/single-cell-rna-seq-data-analysis-using-markov-affinity-based-graph-imputation/</link>
<pubDate>Thu, 28 Jan 2021 00:00:00 +0000</pubDate>
<guid>/2021/01/28/single-cell-rna-seq-data-analysis-using-markov-affinity-based-graph-imputation/</guid>
<description>import magicimport pandas as pdimport matplotlib.pyplot as pltX = pd.read_csv(&quot;test_data.csv&quot;)X.shape(500, 197)magic_operator = magic.MAGIC()X_magic = magic_operator.fit_transform(X, genes=[&#39;VIM&#39;, &#39;CDH1&#39;, &#39;ZEB1&#39;])X_magic.shapeCalculating MAGIC...Running MAGIC on 500 cells and 197 genes.Calculating graph and diffusion operator...Calculating PCA...Calculated PCA in 0.02 seconds.Calculating KNN search...Calculated KNN search in 0.04 seconds.Calculating affinities...Calculated affinities in 0.03 seconds.Calculated graph and diffusion operator in 0.</description>
</item>
<item>
<title>Algorithms in SICP</title>
<link>/2021/01/22/algorithms-in-sicp/</link>
<pubDate>Fri, 22 Jan 2021 00:00:00 +0000</pubDate>
<guid>/2021/01/22/algorithms-in-sicp/</guid>
<description>1 Building Abstractions with Procedures1.2 Procedures and the Processes They Generate1.2.1 Linear Recursion and Iteration1.3 Formulating Abstractions with Higher-Order Procedures1.3.1 Procedures as Arguments1.3.2 Constructing Procedures Using lambda1.3.3 Procedures as General Methods1.3.4 Procedures as Returned Values2 Building Abstractions with Data2.1 Introduction to Data Abstraction2.1.1 Example: Arithmetic Operations for Rational Numbers2.</description>
</item>
<item>
<title>Topology</title>
<link>/2021/01/17/topology/</link>
<pubDate>Sun, 17 Jan 2021 00:00:00 +0000</pubDate>
<guid>/2021/01/17/topology/</guid>
<description>If \(X\) is a topological space with topology \(\mathscr T\), we say that a subset \(U\) of \(X\) is an open set of \(X\) if \(U\) belongs to the collection \(\mathscr T\). Using this terminology, one can say that a topological space is a set \(X\) together with a collection of subsets of \(X\), called open sets, such that \(\varnothing\) and \(X\) are both open, and such that arbitrary unions and finite intersections of open sets are open.</description>
</item>
<item>
<title>Banach space</title>
<link>/2021/01/03/banach-space/</link>
<pubDate>Sun, 03 Jan 2021 00:00:00 +0000</pubDate>
<guid>/2021/01/03/banach-space/</guid>
<description>A complex vector space \(X\) is said to be a normed linear space if to each \(x\in X\), there is associated a nonnegative real number \(\lVert x\rVert\), called the norm of \(x\), such that
\(\lVert x+y\rVert\leq\lVert x\rVert+\lVert y\rVert\) for all \(x\) and \(y\in X\),
\(\lVert ax\rVert=|a|\lVert x\rVert\) if \(x\in X\) and \(a\) is a scalar.
\(\lVert x\rVert=0\) implies \(x=0\).
Every normed linear space may be regarded as a metric space, the distance between \(x\) and \(y\) being \(\lVert x-y\rVert\).</description>
</item>
<item>
<title>Trigonometric Series</title>
<link>/2021/01/02/trigonometric-series/</link>
<pubDate>Sat, 02 Jan 2021 00:00:00 +0000</pubDate>
<guid>/2021/01/02/trigonometric-series/</guid>
<description>Let \(T\) be the unit circle in the complex plane, i.e., the set of all complex numbers of absolute value \(1\). If \(F\) is any function on \(T\) and if \(f\) is defined on \(R^1\) by \[f(t)=F(e^{it})\] Then \(f\) is a periodic function of period \(2\pi\). Conversely, if \(f\) is a function on \(R^1\), with period \(2\pi\), then there is a function \(F\) on \(T\) such that \[f(t)=F(e^{it})\] holds.</description>
</item>
<item>
<title>Hilbert Space</title>
<link>/2020/12/31/hilbert-space/</link>
<pubDate>Thu, 31 Dec 2020 00:00:00 +0000</pubDate>
<guid>/2020/12/31/hilbert-space/</guid>
<description>A complex vector space \(H\) is called an inner product space if to each ordered pair of vectors \(x,y\in H\) there is associated a complex number \((x,y)\), the so-called “inner product” of \(x\) and \(y\), such that the following rules hold:
\((a)\) \((y,x)=\overline{(x,y)}\). (The bar denotes complex conjugation.)
\((b)\) \((x+y,z)=(x,z)+(y,z),\quad x,y,z\in H\)
\((c)\) \((ax,y)=a(x,y),\quad x,y\in H, a\text{ is a scalar}\)
\((d)\) \((x,x)\ge0,\quad \forall x\in H\)
\((e)\) \((x,x)=0\) only if \(x=0\).</description>
</item>
<item>
<title>L^p-Spaces</title>
<link>/2020/12/29/l-p-spaces/</link>
<pubDate>Tue, 29 Dec 2020 00:00:00 +0000</pubDate>
<guid>/2020/12/29/l-p-spaces/</guid>
<description>A real function \(\varphi\) defined on a segment \((a,b),\quad -\infty\leq a&lt;b\leq \infty\), is called convex if the inequality \[\varphi((1-\lambda)x+\lambda y)\leq(1-\lambda)\varphi(x)+\lambda\varphi(y)\] or equivalently \[\frac{\varphi(t)-\varphi(s)}{t-s}\leq\frac{\varphi(u)-\varphi(t)}{u-t},\quad a&lt;s&lt;t&lt;u&lt;b\] holds whenever \(a&lt;x,y&lt;b,\quad 0\leq\lambda\leq1\). If \(x&lt;t&lt;y\), then the point \((t,\varphi(t))\) should lie below or on the line connecting the points \((x,\varphi(x))\) and \((y,\varphi(y))\) in the plane.
If \(\varphi\) is convex on \((a, b)\) then \(\varphi\) is continuous on \((a, b)\).
Suppose \(a&lt;s&lt;x&lt;y&lt;t&lt;b\), write \(S\) for the point \((s,\varphi(s))\) in the plane, and deal similarly with \(x\), \(y\), and \(t\).</description>
</item>
<item>
<title>Positive Borel Measures</title>
<link>/2020/12/22/positive-borel-measures/</link>
<pubDate>Tue, 22 Dec 2020 00:00:00 +0000</pubDate>
<guid>/2020/12/22/positive-borel-measures/</guid>
<description>The support of a complex function \(f\) on a topological space \(X\) is the closure of the set \[\{x:f(x)\ne0\}\] The collection of all continuous complex functions on \(X\) whose support is compact is denoted by \(C_c(X)\), which is a vector space. The notation \[K\prec f\] means that \(K\) is a compact subset of \(X\), that \(f\in C_c(X),\;\;0\leq f(x)\leq 1\) for all \(x\in X\) and that \(f(x)=1\) for all \(x\in K\).</description>
</item>
<item>
<title>Measures</title>
<link>/2020/12/16/measures/</link>
<pubDate>Wed, 16 Dec 2020 00:00:00 +0000</pubDate>
<guid>/2020/12/16/measures/</guid>
<description>拓扑含空全交并,
开由开得即连续。
西代含空全补并,
开由可测即可测。
开由博雷即博雷,
可测博雷由可测。
A topology contains the empty set, the whole set, intersections and unions;
a function is continuous when the preimages of open sets are open.
A \(\sigma\)-algebra contains the empty set, the whole set, complements and unions;
a function is measurable when the preimages of open sets are measurable.
A Borel mapping is one whose preimages of open sets are Borel sets;
a measurable function has measurable preimages of Borel sets (members of the \(\sigma\)-algebra).
The empty set is \(\varnothing\). A collection \(\tau\) of subsets of a set \(X\) is said to be a topology in \(X\) if \(\tau\) has the following properties: (i) \(\varnothing\in\tau, X\in\tau\).</description>
</item>
<item>
<title>Lebesgue Theory</title>
<link>/2020/12/10/lebesgue-theory/</link>
<pubDate>Thu, 10 Dec 2020 00:00:00 +0000</pubDate>
<guid>/2020/12/10/lebesgue-theory/</guid>
<description>If \(A\) and \(B\) are two sets, we write \(A-B\) for the set of all elements \(x\) such that \(x\in A, x\notin B\). A family \(\mathscr R\) of sets is called a ring if \(A\in\mathscr R, B\in\mathscr R\) implies \[A\cup B\in\mathscr R,\quad A-B\in\mathscr R, \quad A\cap B=A-(A-B)\in\mathscr R\] A ring \(\mathscr R\) is called a \(\sigma\)-ring if \[\overset{\infty}{\underset{n=1}{\bigcup}}A_n\in\mathscr R\] whenever \(A_n\in\mathscr R(n=1,2,\cdots)\). And if \(\mathscr R\) is a \(\sigma\)-ring, \[\overset{\infty}{\underset{n=1}{\bigcap}}A_n=A_1-\overset{\infty}{\underset{n=1}{\bigcup}}(A_1-A_n)\in\mathscr R\]</description>
</item>
<item>
<title>Closed forms and exact forms</title>
<link>/2020/12/08/closed-forms-and-exact-forms/</link>
<pubDate>Tue, 08 Dec 2020 00:00:00 +0000</pubDate>
<guid>/2020/12/08/closed-forms-and-exact-forms/</guid>
<description>Let \(\omega\) be a \(k\)-form in an open set \(E\subset R^n\). If there is a \((k-1)\)-form \(\lambda\) in \(E\) such that \(\omega=d\lambda\), then \(\omega\) is said to be exact in \(E\). If \(\omega\) is of class \(\mathscr C&#39;\) and \(d\omega=0\), then \(\omega\) is said to be closed.
If \(\omega\) is of class \(\mathscr C&#39;&#39;\) in \(E\), then \[d^2\omega=0\] For a \(0\)-form \(f\in\mathscr C&#39;&#39;(E)\) \[\begin{align}d^2f&amp;=d\Biggl(\sum_{j=1}^{n}(D_jf)(\mathbf x)dx_j\Biggr)\\&amp;=\sum_{j=1}^{n}d(D_jf)(\mathbf x)dx_j\\&amp;=\sum_{i=1,j=1}^{n}(D_{ij}f)(\mathbf x)dx_i\land dx_j\\\end{align}\] Since \(D_{ij}f=D_{ji}f\) and \(dx_i\land dx_j=-dx_j\land dx_i\) so \[d^2\omega=(d^2f)\land dx_I=0\] Then every exact form of class \(\mathscr C&#39;\) is closed.</description>
</item>
<item>
<title>Stokes' theorem</title>
<link>/2020/12/07/stokes-theorem/</link>
<pubDate>Mon, 07 Dec 2020 00:00:00 +0000</pubDate>
<guid>/2020/12/07/stokes-theorem/</guid>
<description>If \(\Psi\) is a \(k\)-chain of class \(\mathscr C&#39;&#39;\) in an open set \(V\subset R^m\) and if \(\omega\) is a \((k-1)\)-form of class \(\mathscr C&#39;\) in \(V\), then \[\int_{\Psi}d\omega=\int_{\partial\Psi}\omega\]
The case \(k=m=1\) is the fundamental theorem of calculus: If \(f\in \mathscr R\) on \([a,b]\) and if there is a differentiable function \(F\) on \([a,b]\) such that \(F&#39;=f\), then \[\int_{a}^{b}f(x)dx=F(b)-F(a)\] Let \(\varepsilon&gt;0\), choose a partition \(P=\{x_0,\cdots,x_n\}\) of \([a,b]\) so that \[U(P,f)-L(P,f)&lt;\varepsilon\] and, by the mean value theorem, choose points \(t_i\in[x_{i-1},x_i]\) such that \[F(x_i)-F(x_{i-1})=f(t_i)\Delta x_i\] for \(i=1,\cdots,n\) Thus \[\sum_{i=1}^{n}f(t_i)\Delta x_i=F(b)-F(a)\] Then \[\Biggl|F(b)-F(a)-\int_{a}^{b}f(x)dx\Biggr|\leq \Biggl|F(b)-F(a)-\sum_{i=1}^{n}f(t_i)\Delta x_i\Biggr|+\Biggl|\sum_{i=1}^{n}f(t_i)\Delta x_i-\int_{a}^{b}f(x)dx\Biggr|&lt;0+\varepsilon=\varepsilon\] Then \[\int_{a}^{b}f(x)dx=F(b)-F(a)\]</description>
</item>
<item>
<title>Affine simplexes and chains</title>
<link>/2020/12/04/affine-simplexes-and-chains/</link>
<pubDate>Fri, 04 Dec 2020 00:00:00 +0000</pubDate>
<guid>/2020/12/04/affine-simplexes-and-chains/</guid>
<description>A mapping \(\mathbf f\) that carries a vector space \(X\) into a vector space \(Y\) is said to be affine if \[\mathbf f(\mathbf x)-\mathbf f(\mathbf 0)\] is linear, or in other words \[\mathbf f(\mathbf x)-\mathbf f(\mathbf 0)=A\mathbf x\quad A\in L(X,Y)\] The standard simplex \(Q^k\) is defined to be the set of all \(\mathbf u\in R^k\) of the form \[\mathbf u=\sum_{i=1}^{k}a_i\mathbf e_i\] where \(0\leq a_i, \sum a_i\leq 1, i=1,\cdots,k\).</description>
</item>
<item>
<title>Differential Forms</title>
<link>/2020/11/27/differential-forms/</link>
<pubDate>Fri, 27 Nov 2020 00:00:00 +0000</pubDate>
<guid>/2020/11/27/differential-forms/</guid>
<description>Suppose \(I^k\) is a k-cell in \(R^k\) consisting of all \[\mathbf x=(x_1,\cdots,x_k)\] such that \(a_i\leq x_i\leq b_i\quad(i=1,\cdots,k)\) and \(f\) is a real continuous function on \(I^k\). Put \(f=f_k\) and define \[f_{k-1}(x_1,\cdots,x_{k-1})=\int_{a_k}^{b_k}f_{k}(x_1,\cdots,x_{k-1},x_k)dx_k\] We repeat this process; after \(k\) steps we obtain a number, which we define as \[L(f)=\int_{I^k}f(\mathbf x)d\mathbf x\] or \[\int_{I^k}f\] If \(L&#39;(f)\) is the result obtained by carrying out the \(k\) integrations in some other order, then for every \(f\in\mathscr C(I^k), L(f)=L&#39;(f)\).</description>
</item>
<item>
<title>Functions of several variables</title>
<link>/2020/11/23/functions-of-several-variables/</link>
<pubDate>Mon, 23 Nov 2020 00:00:00 +0000</pubDate>
<guid>/2020/11/23/functions-of-several-variables/</guid>
<description>Let \(L(X,Y)\) be the set of all linear transformations of the vector space \(X\) into the vector space \(Y\). For \(A\in L(R^n,R^m)\), define the norm \(\lVert A\rVert\) of \(A\) to be the sup of all numbers \(\lvert A\mathbf x\rvert\), where \(\mathbf x\) ranges over all vectors in \(R^n\) with \(\lvert \mathbf x\rvert\leq1\). The inequality \[\lvert A\mathbf x\rvert\leq\lVert A\rVert\lvert \mathbf x\rvert\] holds for all \(\mathbf x\in R^n\). If \(\lambda\) is such that \[\lvert A\mathbf x\rvert\leq\lambda\lvert \mathbf x\rvert\] for all \(\mathbf x\in R^n\) then \(\lVert A\rVert\leq\lambda\).</description>
</item>
<item>
<title>Fourier Series</title>
<link>/2020/11/21/fourier-series/</link>
<pubDate>Sat, 21 Nov 2020 00:00:00 +0000</pubDate>
<guid>/2020/11/21/fourier-series/</guid>
<description>\[\cos(x)=\frac{1}{2}(e^{ix}+e^{-ix})\]\[\sin(x)=\frac{1}{2i}(e^{ix}-e^{-ix})\]\[e^{ix}=\cos(x)+i\sin(x)\]\[|e^{ix}|^2=e^{ix}\overline{e^{ix}}=e^{ix}e^{-ix}=1\] then \(|e^{inx}|=1\). Let \(x_0\) be the smallest positive number such that \(\cos(x_0)=0\); we define the number \(\pi\) by \(\pi=2x_0\). Then \(\cos(\pi/2)=0\). Because \(|\cos(\pi/2)+i\sin(\pi/2)|=\sqrt{\cos^2(\pi/2)+\sin^2(\pi/2)}=1\) and \(\cos(\pi/2)=0\), we get \(\sin^2(\pi/2)=1\). Since \(\sin&#39;(x)=\cos(x)&gt;0\) in \((0,\pi/2)\), \(\sin(x)\) is increasing in \((0,\pi/2)\), hence \(\sin(\pi/2)=1\). Then \[e^{\frac{\pi}{2}i}=\cos(\pi/2)+i\sin(\pi/2)=i\] \[e^{\pi i}=\cos(\pi)+i\sin(\pi)=-1\]\[e^{-\pi i}=\cos(-\pi)+i\sin(-\pi)=-1\]\[e^{2\pi i}=\cos(2\pi)+i\sin(2\pi)=1\]\[e^{z+2\pi i}=e^{z}e^{2\pi i}=e^z\quad(\text{z complex})\] Then \(e^{z}\) is periodic, with period \(2\pi i\).\[\int_{-\pi}^{\pi} e^{inx}dx=\frac{e^{inx}}{in}\Biggl|_{-\pi}^{\pi}=\begin{cases}2\pi &amp; (\text{if } n=0) \\0 &amp; (\text{if } n=\pm1,\pm2,\cdots)\end{cases}\]</description>
</item>
<item>
<title>Exponential and Logarithmic function</title>
<link>/2020/11/20/exponential-and-logarithmic-function/</link>
<pubDate>Fri, 20 Nov 2020 00:00:00 +0000</pubDate>
<guid>/2020/11/20/exponential-and-logarithmic-function/</guid>
<description>\[e=\sum_{n=0}^{\infty}\frac{1}{n!}\quad(n=0,1,2,3,\cdots)\]
\[\begin{align}(1+\frac{1}{n})^n&amp;=\sum_{m=0}^{n}{n \choose m}\cdot 1^{m}\cdot(\frac{1}{n})^{n-m}\\&amp;=\sum_{m=0}^{n}{n \choose m}\cdot(\frac{1}{n})^{n-m}\\&amp;={n \choose n}+{n \choose n-1}(\frac{1}{n})+{n \choose n-2}(\frac{1}{n})^2+\cdots+{n \choose 0}(\frac{1}{n})^n\\&amp;=1+n(\frac{1}{n})+\frac{n(n-1)}{2!}(\frac{1}{n})^2+\cdots+\frac{n!}{n!}(\frac{1}{n})^n\\&amp;=1+1+\frac{1}{2!}(1-\frac{1}{n})+\cdots+\frac{1}{n!}(1-\frac{1}{n})(1-\frac{2}{n})\cdots(1-\frac{n-1}{n})\\&amp;\leq \sum_{k=0}^{n}\frac{1}{k!}\leq e\end{align}\] Next if \(n\ge m\), \[(1+\frac{1}{n})^n=1+1+\frac{1}{2!}(1-\frac{1}{n})+\cdots+\frac{1}{n!}(1-\frac{1}{n})(1-\frac{2}{n})\cdots(1-\frac{n-1}{n})\\\ge 1+1+\frac{1}{2!}(1-\frac{1}{n})+\cdots+\frac{1}{m!}(1-\frac{1}{n})(1-\frac{2}{n})\cdots(1-\frac{m-1}{n})\\\ge 1+1+\frac{1}{2!}+\cdots+\frac{1}{m!}\\=\sum_{k=0}^{m}\frac{1}{k!}\] let \(n\to\infty\), keep \(m\) fixed, we get \[\lim_{n\to\infty}\text{inf}(1+\frac{1}{n})^n\ge \sum_{k=0}^{m}\frac{1}{k!}\] when \(m\to\infty\) \[\lim_{n\to\infty}\text{inf}(1+\frac{1}{n})^n\ge e\] Then \[\lim_{n\to\infty}(1+\frac{1}{n})^n=e\]
For fixed rational number \(z\), \[\begin{align}\lim_{n\to\infty}(1+\frac{z}{n})^n&amp;=\Bigl[\lim_{n\to\infty}(1+\frac{1}{n/z})^{n/z}\Bigr]^{z}\\&amp;=e^z\end{align}\]
The Ratio Test for series convergence: The series \(\sum a_n\) converges if \[\lim_{n\to\infty}\text{sup}\Biggl|\frac{a_{n+1}}{a_n}\Biggr|&lt;1\] and diverges if \[\Biggl|\frac{a_{n+1}}{a_n}\Biggr|\ge1\] for all \(n\ge n_0\), where \(n_0\) is some fixed integer.</description>
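A quick numerical sanity check in R (the values of n below are arbitrary choices, not from the post): the gap between \((1+\frac{1}{n})^n\) and \(e\) shrinks as \(n\) grows.
# check that (1 + 1/n)^n approaches e = exp(1) as n grows
n &lt;- 10^(1:6)
cbind(n, value = (1 + 1/n)^n, error = abs((1 + 1/n)^n - exp(1)))
# the error column decreases toward 0, consistent with the limit above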
</item>
<item>
<title>Power series</title>
<link>/2020/11/18/power-series/</link>
<pubDate>Wed, 18 Nov 2020 00:00:00 +0000</pubDate>
<guid>/2020/11/18/power-series/</guid>
<description>The power series are \[\sum_{n=0}^{\infty}c_nx^n\] The numbers \(c_n\) are called coefficients. \[R=\frac{1}{\displaystyle\lim_{n\to \infty}\text{sup}\sqrt[n]{|c_n|}}\] is called the radius of convergence of power series \[\sum_{n=0}^{\infty}c_nx^n\] \[\displaystyle\lim_{n\to \infty}\text{sup}\sqrt[n]{|c_nx^n|}=\frac{|x|}{R}\] Then \[\sum_{n=0}^{\infty}c_nx^n\] converges if \(|x|&lt;R\) and diverges if \(|x|&gt;R\).
Suppose the series \(\sum_{n=0}^{\infty}c_nx^n\) converges for \(|x|&lt;R\); then it converges uniformly on \([-R+\varepsilon,R-\varepsilon]\) for every \(\varepsilon&gt;0\). For \(|x|\leq R-\varepsilon\) we have \[|c_nx^n|\leq|c_n(R-\varepsilon)^n|\] and since \[\sum c_n(R-\varepsilon)^n\] converges absolutely (every power series converges absolutely in the interior of its interval of convergence), there is an integer \(N\) such that \[|\sum_{i=0}^{n}c_ix^i-\sum_{i=0}^{m}c_ix^i|=|\sum_{m+1}^{n}c_ix^i|&lt;\varepsilon,\quad n\ge m\ge N\] hence \(\sum_{n=0}^{\infty}c_nx^n\) converges uniformly.</description>
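A small numerical illustration in R (the coefficients \(c_n=2^n\) are an assumed example, not from the post), for which the root formula gives radius \(R=1/2\):
# estimate R = 1 / limsup |c_n|^(1/n) for the assumed coefficients c_n = 2^n
n &lt;- 1:50
cn &lt;- 2^n
1 / max(abs(cn)^(1/n))     # here exactly the radius 1/2, since |c_n|^(1/n) = 2 for every n
sum(cn * 0.25^n)           # a partial sum at x = 0.25 &lt; R; the series converges (to about 1)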
</item>
<item>
<title>Sequences and Series of functions</title>
<link>/2020/11/18/sequences-and-series-of-functions/</link>
<pubDate>Wed, 18 Nov 2020 00:00:00 +0000</pubDate>
<guid>/2020/11/18/sequences-and-series-of-functions/</guid>
<description>Suppose \(\{f_n\}, n=1,2,3,\cdots,\) is a sequence of functions defined on \(E\), and suppose the sequence of numbers \(\{f_n(x)\}\) converges for every \(x\in E\). We define the function \(f\) by \[f(x)=\lim_{n\to \infty}f_n(x)\quad (x\in E)\] We say that \(\{f_n\}\) converges to \(f\) pointwise on \(E\), and \(f\) is the limit function.
A sequence of functions \(\{f_n\}, n=1,2,3,\cdots,\) converges uniformly on \(E\) to a function \(f\) if for every \(\epsilon&gt;0\) there is an integer \(N\) such that \(n&gt;N\) implies \[|f_n(x)-f(x)|\le\epsilon\] for all \(x\in E\).</description>
</item>
<item>
<title>L'Hospital's rule and Taylor's theorem</title>
<link>/2020/11/15/l-hospital-s-rule-and-taylor-s-theorem/</link>
<pubDate>Sun, 15 Nov 2020 00:00:00 +0000</pubDate>
<guid>/2020/11/15/l-hospital-s-rule-and-taylor-s-theorem/</guid>
<description>Let \(f\) be defined on \([a,b]\); if for \(x\in[a,b]\) the limit \[f&#39;(x)=\underset{t\to x}{\lim}\frac{f(t)-f(x)}{t-x}\] exists, we say that \(f\) is differentiable at \(x\). If \(f&#39;\) is defined at every point of a set \(E\subset[a,b]\), we say that \(f\) is differentiable on \(E\).
The mean value theorem: If \(f\) and \(g\) are continuous real functions on \([a,b]\) which are differentiable in \((a,b)\), then there is a point \(x\in(a,b)\) at which \[\frac{f(b)-f(a)}{g(b)-g(a)}=\frac{f&#39;(x)}{g&#39;(x)}\] Put \(h(t)=[f(b)-f(a)]g(t)-[g(b)-g(a)]f(t)\quad(a\le t\le b)\) then \(h\) is continuous on \([a,b]\) and differentiable in \((a,b)\), and \(h(a)=[f(b)-f(a)]g(a)-[g(b)-g(a)]f(a)=f(b)g(a)-g(b)f(a)=h(b)\). To prove this theorem, we have to show that \(h&#39;(x)=0\) for some \(x\in(a,b)\).</description>
</item>
<item>
<title>Riemann-Stieltjes integral</title>
<link>/2020/11/15/riemann-stieltjes-integral/</link>
<pubDate>Sun, 15 Nov 2020 00:00:00 +0000</pubDate>
<guid>/2020/11/15/riemann-stieltjes-integral/</guid>
<description>A partition \(P\) of the interval \([a,b]\) is a finite set of points \(x_0,x_1,\cdots,x_n\), where \(a=x_0\le x_1\le \cdots\le x_n=b\) and \(\Delta x_i=x_i-x_{i-1}\quad(i=1,\cdots,n)\). Let \[M_i=\text{sup }f(x)\quad(x_{i-1}\le x\le x_i)\] \[m_i=\text{inf }f(x)\quad(x_{i-1}\le x\le x_i)\] Corresponding to each partition \(P\) of \([a,b]\), we put \[U(P,f)=\sum_{i=1}^{n}M_i\Delta x_i\] \[L(P,f)=\sum_{i=1}^{n}m_i\Delta x_i\] \[\overline{\int}_{a}^{b} f dx=\text{inf}\sum_{i=1}^{n}M_i\Delta x_i\] \[\underline{\int}_{a}^{b} f dx=\text{sup}\sum_{i=1}^{n}m_i\Delta x_i\] (the inf and sup being taken over all partitions \(P\)), which are called the upper and lower Riemann integrals of \(f\) over \([a,b]\), respectively. If the upper and lower integrals are equal \[\overline{\int}_{a}^{b} f dx=\underline{\int}_{a}^{b} f dx\] we say that \(f\) is Riemann-integrable on \([a,b]\), and we write \(f\in\mathscr R\).</description>
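A small R illustration (the function \(f(x)=x^2\) on \([0,1]\) and the partition sizes are assumed examples): refining the partition squeezes \(U(P,f)\) and \(L(P,f)\) together toward the common value \(1/3\).
# upper and lower sums for the assumed example f(x) = x^2 on [0, 1]
f &lt;- function(x) x^2
riemann &lt;- function(n) {
  x &lt;- seq(0, 1, length.out = n + 1)      # partition points x_0, ..., x_n
  dx &lt;- diff(x)
  M &lt;- pmax(f(x[-1]), f(x[-(n + 1)]))     # sup of f on each subinterval (f is monotone here)
  m &lt;- pmin(f(x[-1]), f(x[-(n + 1)]))     # inf of f on each subinterval
  c(U = sum(M * dx), L = sum(m * dx))
}
riemann(10); riemann(1000)                 # both approach 1/3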
</item>
<item>
<title>Continuity</title>
<link>/2020/11/12/continuity/</link>
<pubDate>Thu, 12 Nov 2020 00:00:00 +0000</pubDate>
<guid>/2020/11/12/continuity/</guid>
<description>If a set \(E\) in \(R^k\) is closed and bounded, then \(E\subset I\) for some compact k-cell \(I\); hence \(E\) is a closed subset of the compact set \(I\), and \(E\) is also compact. A bounded infinite set \(E\) in \(R^k\) is a subset of a compact k-cell \(I\), so \(E\) must have a limit point in \(I\), and hence in \(R^k\). If \(E\) is compact, then every infinite subset \(K\) of \(E\) has a limit point in \(E\), which in turn implies that \(E\) is closed and bounded.</description>
</item>
<item>
<title>Numerical Sequences and Series</title>
<link>/2020/11/10/numerical-sequences-and-series/</link>
<pubDate>Tue, 10 Nov 2020 00:00:00 +0000</pubDate>
<guid>/2020/11/10/numerical-sequences-and-series/</guid>
<description>A sequence \(\{p_n\}\) in a metric space \(X\) is said to converge if there is a point \(p\in X\) with the following property: For every \(\epsilon&gt;0\) there is an integer \(N\) such that \(n\ge N\) implies that \(d(p_n,p)&lt;\epsilon\). We also say \(\{p_n\}\) converges to \(p\) or \(p\) is the limit of \(\{p_n\}\) and we write \[\lim_{n\to \infty}p_n=p\] or \(p_n\to p\). A sequence \(\{p_n\}\) in a metric space \(X\) is said to be a Cauchy sequence if for every \(\epsilon&gt;0\), there is an integer \(N\) such that \(d(p_n,p_m)&lt;\epsilon\) if \(n\ge N\) and \(m\ge N\).</description>
</item>
<item>
<title>Set theory</title>
<link>/2020/11/06/set-theory/</link>
<pubDate>Fri, 06 Nov 2020 00:00:00 +0000</pubDate>
<guid>/2020/11/06/set-theory/</guid>
<description>In a metric space, a neighborhood of \(p\) is a set \(N_r(p)\) consisting of all \(q\) such that \(d(p,q)&lt;r\), for some \(r&gt;0\). The number \(r\) is called the radius of \(N_r(p)\). A point \(p\) is called a limit point of the set \(E\), if every neighborhood of \(p\) contains a point \(q\ne p\) with \(q\in E\). \(E\) is called closed if every limit point of \(E\) is a point of \(E\).</description>
</item>
<item>
<title>Convex sets</title>
<link>/2020/10/31/convex-sets/</link>
<pubDate>Sat, 31 Oct 2020 00:00:00 +0000</pubDate>
<guid>/2020/10/31/convex-sets/</guid>
<description>A set \(C\subseteq\mathbf R^n\) is an affine set if for any two distinct points \(x_1,x_2\in C\) and any \(\theta\in \mathbf R\), the linear combination of these two points lies in \(C\), \(\theta x_1+(1-\theta)x_2\in C\), with the coefficients summing to one. This kind of linear combination is called an affine combination. The set of all affine combinations of points in a set \(C\subseteq\mathbf R^n\) is called the affine hull of \(C\), and is denoted \[\mathbf{\text{aff }}C=\{\theta_1x_1+\cdots+\theta_kx_k|x_1,\cdots,x_k\in C,\theta_1+\cdots+\theta_k=1\}\] The affine hull is the smallest affine set that contains \(C\).</description>
</item>
<item>
<title>Clustering</title>
<link>/2020/10/29/clustering/</link>
<pubDate>Thu, 29 Oct 2020 00:00:00 +0000</pubDate>
<guid>/2020/10/29/clustering/</guid>
<description>Hierarchical Clustering Methods; Nonhierarchical Clustering Methods; Correspondence Analysis
Matrix \(\mathbf X\), with elements \(x_{ij}\), is an \(I\times J\) two-way contingency table of unscaled frequencies or counts, \(i=1,2,\cdots,I;j=1,2,\cdots,J\), with grand total \(n\). The matrix of proportions \(\mathbf P=\{p_{ij}\}\) with elements \(p_{ij}=\frac{1}{n}x_{ij}\), is called the correspondence matrix. The row sums are the vector \[\mathbf r=\{r_{i}=\sum_{j=1}^{J}p_{ij}=\sum_{j=1}^{J}\frac{1}{n}x_{ij}\}\] or \[\underset{(I\times 1)}{\mathbf r}=\underset{(I\times J)}{\mathbf P}\underset{(J\times1)}{\mathbf 1_J}\] The column sums are the vector \[\mathbf c=\{c_{j}=\sum_{i=1}^{I}p_{ij}=\sum_{i=1}^{I}\frac{1}{n}x_{ij}\}\] or \[\underset{(J\times 1)}{\mathbf c}=\underset{(J\times I)}{\mathbf P^T}\underset{(I\times1)}{\mathbf 1_I}\] Let the diagonal matrices be \[\mathbf D_r=diag(r_1,r_2,\cdots,r_I)\] \[\mathbf D_c=diag(c_1,c_2,\cdots,c_J)\] Correspondence analysis can be formulated as the weighted least squares problem to select matrix \(\hat{\mathbf P}=\{\hat{p}_{ij}\}\), which has a specified reduced rank and minimizes the sum of squares \[\sum_{i=1}^{I}\sum_{j=1}^{J}\frac{(p_{ij}-\hat{p}_{ij})^2}{r_ic_j}=tr\Bigl[(\mathbf D_r^{-1/2}(\mathbf P-\hat{\mathbf P})\mathbf D_c^{-1/2})(\mathbf D_r^{-1/2}(\mathbf P-\hat{\mathbf P})\mathbf D_c^{-1/2})^T\Bigr]\] since \((p_{ij}-\hat{p}_{ij})/\sqrt{r_ic_j}\) is the \((i,j)\) element of \(\mathbf D_r^{-1/2}(\mathbf P-\hat{\mathbf P})\mathbf D_c^{-1/2}\). The scaled version of the correspondence matrix \(\mathbf P=\{p_{ij}\}\) is \[\mathbf B=\mathbf D_r^{-1/2}\mathbf P\mathbf D_c^{-1/2}\] The best low \(\text{rank}=s\) approximation \(\hat{\mathbf B}\) to \(\mathbf B\) is given by the first \(s\) terms in the singular-value decomposition \[\mathbf D_r^{-1/2}\mathbf P\mathbf D_c^{-1/2}=\sum_{k=1}^{J}\widetilde{\lambda}_k\widetilde{\mathbf u}_k\widetilde{\mathbf v}_k^T\] where \[\mathbf D_r^{-1/2}\mathbf P\mathbf D_c^{-1/2}\widetilde{\mathbf v}_k=\widetilde{\lambda}_k\widetilde{\mathbf u}_k\] and \[\widetilde{\mathbf u}_k^T\mathbf D_r^{-1/2}\mathbf P\mathbf D_c^{-1/2}=\widetilde{\lambda}_k\widetilde{\mathbf v}_k^T\] Then the approximation to \(\mathbf P\) is given by \[\hat{\mathbf P}=\mathbf D_r^{1/2}\hat{\mathbf B}\mathbf D_c^{1/2}\approx\sum_{k=1}^{s}\widetilde{\lambda}_k(\mathbf D_r^{1/2}\widetilde{\mathbf u}_k)(\mathbf D_c^{1/2}\widetilde{\mathbf v}_k)^T\] and the error of approximation is \[\sum_{k=s+1}^{J}\widetilde{\lambda}_k^2\] The term \(\mathbf r\mathbf c^T\) always provides the best rank one approximation to the correspondence matrix \(\mathbf P\); this corresponds to the assumption of independence of the rows and columns.</description>
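A minimal R sketch of these steps (the 2 by 3 table below is an assumed toy example): the scaled matrix \(\mathbf B\) is decomposed by SVD and its leading term reproduces \(\mathbf r\mathbf c^T\), the independence model.
X &lt;- matrix(c(10, 20, 30, 40, 50, 60), nrow = 2)     # assumed 2 x 3 contingency table
P &lt;- X / sum(X)                                       # correspondence matrix
r &lt;- rowSums(P); cs &lt;- colSums(P)                     # row and column sums
B &lt;- diag(1 / sqrt(r)) %*% P %*% diag(1 / sqrt(cs))   # D_r^(-1/2) P D_c^(-1/2)
s &lt;- svd(B)
s$d                                                   # singular values; the largest is 1
# the leading rank-one term reconstructs r c^T, the independence model
Phat1 &lt;- diag(sqrt(r)) %*% (s$d[1] * s$u[, 1] %*% t(s$v[, 1])) %*% diag(sqrt(cs))
all.equal(Phat1, r %*% t(cs))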
</item>
<item>
<title>Classification</title>
<link>/2020/10/22/classification/</link>
<pubDate>Thu, 22 Oct 2020 00:00:00 +0000</pubDate>
<guid>/2020/10/22/classification/</guid>
<description>Two classes \(\pi_1\) and \(\pi_2\) have prior probabilities \(p_1\) and \(p_2\), respectively, with \(p_1+p_2=1\). Under the two classes, the random variable \(\mathbf x\) follows the density functions \(f_1(\mathbf x)\) and \(f_2(\mathbf x)\) over the region \(R_1+R_2\), and \(\underset{R_1}{\int} f_1(\mathbf x)dx=P(1|1)\), \(\underset{R_2}{\int} f_1(\mathbf x)dx=P(2|1)\), \(\underset{R_2}{\int} f_2(\mathbf x)dx=P(2|2)\), \(\underset{R_1}{\int} f_2(\mathbf x)dx=P(1|2)\). Then the probability that an observation \(\mathbf x\) comes from class \(\pi_1\) and is correctly classified as \(\pi_1\) is \[P(\mathbf x\in R_1|\pi_1)P(\pi_1)=P(1|1)p_1\], and the probability that an observation \(\mathbf x\) is misclassified as \(\pi_1\) is \[P(\mathbf x\in R_1|\pi_2)P(\pi_2)=P(1|2)p_2\] Similarly, the probability that an observation is correctly classified as \(\pi_2\) is \[P(\mathbf x\in R_2|\pi_2)P(\pi_2)=P(2|2)p_2\], and the probability that an observation is misclassified as \(\pi_2\) is \[P(\mathbf x\in R_2|\pi_1)P(\pi_1)=P(2|1)p_1\] The costs of misclassification can be defined by a cost matrix \[\begin{array}{cc|cc}&amp;&amp;\text{Classify as:}\\&amp;&amp;\pi_1&amp;\pi_2\\\hline\\\text{True populations:}&amp;\pi_1&amp;0&amp;c(2|1)\\&amp;\pi_2&amp;c(1|2)&amp;0\\\end{array}\] Then the Expected Cost of Misclassification (ECM) is given by \[\begin{bmatrix}P(2|1)&amp;P(1|2)\\\end{bmatrix}\begin{bmatrix}0&amp;c(2|1)\\c(1|2)&amp;0\\\end{bmatrix}\begin{bmatrix}p_2\\p_1\\\end{bmatrix}=P(1|2)c(1|2)p_2+P(2|1)c(2|1)p_1\] A reasonable classification rule should have an ECM as small as possible.</description>
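A tiny R computation of the ECM from the matrix product above (every number below is an assumed illustration, not from the post):
p1 &lt;- 0.6; p2 &lt;- 0.4          # assumed prior probabilities
P21 &lt;- 0.10; P12 &lt;- 0.05      # assumed misclassification probabilities P(2|1), P(1|2)
c21 &lt;- 5; c12 &lt;- 10           # assumed misclassification costs c(2|1), c(1|2)
P12 * c12 * p2 + P21 * c21 * p1                                     # ECM written out
t(c(P21, P12)) %*% matrix(c(0, c12, c21, 0), 2, 2) %*% c(p2, p1)    # same value via the matrix product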
</item>
<item>
<title>Correlation Analysis</title>
<link>/2020/10/18/correlation-analysis/</link>
<pubDate>Sun, 18 Oct 2020 00:00:00 +0000</pubDate>
<guid>/2020/10/18/correlation-analysis/</guid>
<description>Canonical-correlation analysis (CCA), also called canonical variates analysis, is a way of inferring information from cross-covariance matrices. If we have two groups of variables \(\mathbf X\) has \(p\) variables \[\mathbf X=\begin{bmatrix}X_1\\X_2\\\vdots\\X_p\\\end{bmatrix}\] and \(\mathbf Y\) has \(q\) variables \[\mathbf Y=\begin{bmatrix}Y_1\\Y_2\\\vdots\\Y_q\\\end{bmatrix}\] \[E(\mathbf X)=\boldsymbol\mu_X\] \[Cov(\mathbf X)=\boldsymbol\Sigma_{XX}\] and \[E(\mathbf Y)=\boldsymbol\mu_Y\] \[Cov(\mathbf Y)=\boldsymbol\Sigma_{YY}\] and \[Cov(\mathbf X,\mathbf Y)=\boldsymbol\Sigma_{XY}=\boldsymbol\Sigma_{YX}^T=E(\mathbf X-\boldsymbol\mu_X)(\mathbf Y-\boldsymbol\mu_Y)^T=\begin{bmatrix}\sigma_{X_1Y_1}&amp;\sigma_{X_1Y_2}&amp;\cdots&amp;\sigma_{X_1Y_q}\\\sigma_{X_2Y_1}&amp;\sigma_{X_2Y_2}&amp;\cdots&amp;\sigma_{X_2Y_q}\\\vdots&amp;\vdots&amp;\ddots&amp;\vdots\\\sigma_{X_pY_1}&amp;\sigma_{X_pY_2}&amp;\cdots&amp;\sigma_{X_pY_q}\\\end{bmatrix}\] Linear combinations provide simple summary measures of a set of variables.</description>
</item>
<item>
<title>Factor analysis</title>
<link>/2020/10/11/factor-analysis/</link>
<pubDate>Sun, 11 Oct 2020 00:00:00 +0000</pubDate>
<guid>/2020/10/11/factor-analysis/</guid>
<description>Let \(\mathbf X\) be drawn from a \(p\)-variate normal \(N_p(\boldsymbol\mu, \boldsymbol\Sigma)\) distribution. The matrix of factor loadings is \[\mathbf L=\begin{bmatrix}\ell_{11}&amp;\ell_{12}&amp;\cdots&amp;\ell_{1m}\\\ell_{21}&amp;\ell_{22}&amp;\cdots&amp;\ell_{2m}\\\vdots&amp;\vdots&amp;\ddots&amp;\vdots\\\ell_{p1}&amp;\ell_{p2}&amp;\cdots&amp;\ell_{pm}\\\end{bmatrix}\] where \(\ell_{ij}\) is the loading of the \(i^{th}\) variable on the \(j^{th}\) factor.
The common factor is \[\mathbf F=\begin{bmatrix}F_1\\F_2\\\vdots\\F_m\\\end{bmatrix}\] with \(E(\mathbf F)=\underset{(m\times 1)}{\mathbf0}\), \(Var(F_j)=1,\quad (j=1,2,\cdots,m)\) and \(Cov(\mathbf F)=E(\mathbf F\mathbf F^T)=\underset{(m\times m)}{\mathbf I}\) Then the Orthogonal factor model is \[\underset{(p\times1)}{\mathbf X-\boldsymbol\mu}=\underset{(p\times m)}{\mathbf L}\underset{(m\times1)}{\mathbf F}+\underset{(p\times1)}{\boldsymbol\epsilon}\] with \(E(\boldsymbol\epsilon)=\underset{(p\times 1)}{\mathbf0}\) and \[Cov(\boldsymbol\epsilon)=E(\boldsymbol\epsilon\boldsymbol\epsilon^T)=\boldsymbol\Psi=\begin{bmatrix}\psi_1&amp;0&amp;\cdots&amp;0\\0&amp;\psi_2&amp;\cdots&amp;0\\\vdots&amp;\vdots&amp;\ddots&amp;\vdots\\0&amp;0&amp;\cdots&amp;\psi_p\\\end{bmatrix}\] with \(Var(\epsilon_i)=\psi_i\) and \(\mathbf F\) and \(\boldsymbol\epsilon\) are independent with \(Cov(\boldsymbol\epsilon,\mathbf F)=E(\boldsymbol\epsilon\mathbf F^T)=\underset{(p\times m)}{\mathbf0}\)Because \(\mathbf L\) is fixed, then \[\begin{align}\boldsymbol\Sigma=Cov(\mathbf X)&amp;=E(\mathbf X-\boldsymbol\mu)(\mathbf X-\boldsymbol\mu)^T\\&amp;=E(\mathbf L\mathbf F+\boldsymbol\epsilon)(\mathbf L\mathbf F+\boldsymbol\epsilon)^T\\&amp;=E(\mathbf L\mathbf F+\boldsymbol\epsilon)((\mathbf L\mathbf F)^T+\boldsymbol\epsilon^T)\\&amp;=E\Bigl(\mathbf L\mathbf F(\mathbf L\mathbf F)^T+\boldsymbol\epsilon(\mathbf L\mathbf F)^T+\mathbf L\mathbf F\boldsymbol\epsilon^T+\boldsymbol\epsilon\boldsymbol\epsilon^T\Bigr)\\&amp;=\mathbf LE(\mathbf F\mathbf F^T)\mathbf L^T+\mathbf0+\mathbf0+E(\boldsymbol\epsilon\boldsymbol\epsilon^T)\\&amp;=\mathbf L\mathbf L^T+\boldsymbol\Psi\end{align}\] or \[Var(X_i)=\underset{Var(X_i)}{\underbrace{\sigma_{ii}}}=\mathbf L_i\mathbf L_i^T+\psi_i=\underset{\text{communality}}{\underbrace{\ell_{i1}^2+\ell_{i2}^2+\cdots+\ell_{im}^2}}+\underset{\text{specific variance}}{\underbrace{\psi_i}}\] with \(\mathbf L_i\) is the \(i^{th}\) row of \(\mathbf L\) We can denote the \(i^{th}\) communality as \(h_i^2=\ell_{i1}^2+\ell_{i2}^2+\cdots+\ell_{im}^2,\quad (i=1,2,\cdots,p)\), which is the sum of squares of the loadings of the \(i^{th}\) variable on the \(m\) common factors, and the total variance of the \(i^{th}\) variable is the sum of communality and specific variance \(\sigma_{ii}=h_i^2+\psi_i\)\[Cov(X_i,X_k)=E(\mathbf L_i^T\mathbf F+\epsilon_i)(\mathbf L_k^T\mathbf F+\epsilon_k)^T=\mathbf L_i^T\mathbf L_k=\ell_{i1}\ell_{k1}+\ell_{i2}\ell_{k2}+\cdots+\ell_{im}\ell_{km}\]\[Cov(\mathbf X,\mathbf F)=E(\mathbf X-\boldsymbol\mu)\mathbf F^T=E(\mathbf L\mathbf F+\boldsymbol\epsilon)\mathbf F^T=\mathbf LE(\mathbf F\mathbf F^T)+E(\boldsymbol\epsilon\mathbf F^T)=\mathbf L\] or \[Cov(X_i,F_j)=E(X_i-\mu_i)\mathbf F_j^T=E(\mathbf L_i^T\mathbf F+\epsilon_i)\mathbf F_j^T=\ell_{ij}\]</description>
</item>
<item>
<title>Principal Component Analysis</title>
<link>/2020/10/07/principal-component-analysis/</link>
<pubDate>Wed, 07 Oct 2020 00:00:00 +0000</pubDate>
<guid>/2020/10/07/principal-component-analysis/</guid>
<description>Let the random vector \(\mathbf X^T=[X_1,X_2,\cdots,X_p]\) have the covariance matrix \(\boldsymbol\Sigma\) with eigenvalues \(\lambda_1\ge\lambda_2\ge\cdots\ge\lambda_p\ge0\), the linear combinations \(Y_i=\mathbf a_i^T\mathbf X=a_{i1}X_1+a_{i2}X_2+\cdots+a_{ip}X_p, \quad (i=1,2,\cdots,p)\) has \(Var(Y_i)=Var(\mathbf a_i^T\mathbf X)=\mathbf a_i^TCov(\mathbf X)\mathbf a_i=\mathbf a_i^T\boldsymbol\Sigma\mathbf a_i\) and \(Cov(Y_i,Y_k)=Cov(\mathbf a_i^T\mathbf X, \mathbf a_k^T\mathbf X)=\mathbf a_i^T\boldsymbol\Sigma\mathbf a_k \quad i,k=1,2,\cdots,p\). The principal components are those uncorrelated linear combinations of \([X_1,X_2,\cdots,X_p]\), \(Y_1,Y_2,\cdots,Y_p\) whose variances \(Var(Y_i)=\mathbf a_i^T\boldsymbol\Sigma\mathbf a_i\) are as large as possible, subject to \(\mathbf a_i^T\mathbf a_i=1\). These linear combinations represent the selection of a new coordinate system obtained by rotating the original system with \(Y_1,Y_2,\cdots,Y_p\) as the new coordinate axes.</description>
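A short R check (the built-in iris measurements serve as an assumed example) that the principal component variances are the eigenvalues of the sample covariance matrix:
X &lt;- as.matrix(iris[, 1:4])      # assumed example data
S &lt;- cov(X)
e &lt;- eigen(S)
e$values                          # variances of the principal components, largest first
pc &lt;- prcomp(X)                   # the same decomposition via prcomp
all.equal(e$values, as.numeric(pc$sdev^2))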
</item>
<item>
<title>Comparisons of several means</title>
<link>/2020/09/29/comparisons-of-several-means/</link>
<pubDate>Tue, 29 Sep 2020 00:00:00 +0000</pubDate>
<guid>/2020/09/29/comparisons-of-several-means/</guid>
<description>Paired Comparisons:
If there are \(2\) treatments applied to \(p\)-variate observations, the difference between treatment \(1\) and treatment \(2\) for unit \(j\) is \(\mathbf d_j=\mathbf x_{j1}-\mathbf x_{j2},\quad j=1,2,\cdots,n\). If the \(\mathbf d_j\) are independent \(N_p(\boldsymbol\delta, \mathbf\Sigma_d)\) random vectors, inferences about the vector of mean differences \(\boldsymbol\delta\) can be based upon a \(T^2\)-statistic: \(T^2=n(\overline{\mathbf d}-\boldsymbol\delta)^T\mathbf S_d^{-1}(\overline{\mathbf d}-\boldsymbol\delta)\) is distributed as an \(\frac{(n-1)p}{n-p}F_{p,n-p}\) random variable, where \(\overline{\mathbf d}=\displaystyle\frac{1}{n}\displaystyle\sum_{j=1}^{n}\mathbf d_j\) and \(\mathbf S_d=\displaystyle\frac{1}{n-1}\displaystyle\sum_{j=1}^{n}(\mathbf d_j-\overline{\mathbf d})(\mathbf d_j-\overline{\mathbf d})^T\); then an \(\alpha\)-level hypothesis test of \(H_0:\boldsymbol\delta=\mathbf 0\) versus \(H_1:\boldsymbol\delta\ne\mathbf 0\) rejects \(H_0\) if the observed \(T^2=n\overline{\mathbf d}^T\mathbf S_d^{-1}\overline{\mathbf d}&gt;\frac{(n-1)p}{n-p}F_{p,n-p}(\alpha)\).</description>
</item>
<item>
<title>Inferences about the mean</title>
<link>/2020/09/25/inferences-about-the-mean/</link>
<pubDate>Fri, 25 Sep 2020 00:00:00 +0000</pubDate>
<guid>/2020/09/25/inferences-about-the-mean/</guid>
<description>The hypothesis testing about the mean is a test of the competing hypotheses: \(H_0:\mu=\mu_0\) and \(H_1:\mu\ne\mu_0\). If \(X_1,X_2,\cdots,X_n\) denote a random sample from a normal population, the appropriate test statistic is \(t=\frac{(\overline X-\mu_0)}{s/\sqrt{n}}\) with \(s^2=\frac{1}{(n-1)}\displaystyle\sum_{i=1}^{n}(X_i-\overline X)^2\). Rejecting \(H_0\) when \(|t|\) is large is equivalent to rejecting \(H_0\) when \(t^2=\frac{(\overline X-\mu_0)^2}{s^2/n}=n(\overline X-\mu_0)(s^2)^{-1}(\overline X-\mu_0)\) is large. Then the test becomes reject \(H_0\) in favor of \(H_1\) at significance level \(\alpha\) if \(n(\overline X-\mu_0)(s^2)^{-1}(\overline X-\mu_0)&gt;t_{n-1}^2(\alpha/2)\), its multivariate analog is \(T^2=(\overline {\mathbf X}-\boldsymbol\mu_0)^T(\frac{1}{n}\mathbf S)^{-1}(\overline {\mathbf X}-\boldsymbol\mu_0)=n(\overline {\mathbf X}-\boldsymbol\mu_0)^T\mathbf S^{-1}(\overline {\mathbf X}-\boldsymbol\mu_0)\), where \(\overline {\mathbf X}=\frac{1}{n}\displaystyle\sum_{j=1}^{n}\mathbf X_j\), \(\underset{(p\times p)}{\mathbf S}=\frac{1}{n-1}\displaystyle\sum_{j=1}^{n}(\underset{(p\times 1)}{\mathbf X_j}-\underset{(p\times 1)}{\overline {\mathbf X}})(\underset{(p\times 1)}{\mathbf X_j}-\underset{(p\times 1)}{\overline {\mathbf X}})^T\)</description>
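A minimal R sketch of the \(T^2\) statistic defined above (the data are simulated, so the sample size, dimension and seed are all assumptions):
set.seed(1)
n &lt;- 30; p &lt;- 3
X &lt;- matrix(rnorm(n * p), n, p)               # simulated sample with true mean 0
mu0 &lt;- rep(0, p)                              # hypothesized mean vector
xbar &lt;- colMeans(X)
S &lt;- cov(X)
T2 &lt;- drop(n * t(xbar - mu0) %*% solve(S) %*% (xbar - mu0))
crit &lt;- (n - 1) * p / (n - p) * qf(0.95, p, n - p)   # comparison value at alpha = 0.05
c(T2 = T2, critical = crit)                   # reject H0 if T2 exceeds the critical value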
</item>
<item>
<title>The Multivariate Normal Density</title>
<link>/2020/09/11/the-multivariate-normal-density/</link>
<pubDate>Fri, 11 Sep 2020 00:00:00 +0000</pubDate>
<guid>/2020/09/11/the-multivariate-normal-density/</guid>
<description>The univariate normal pdf is:\[f_X(x)=\frac{1}{\sqrt{2\pi}\sigma}e^{-\frac{1}{2}(\frac{x-\mu}{\sigma})^2}, \quad -\infty&lt;x&lt;+\infty\] The term \((\frac{x-\mu}{\sigma})^2=(x-\mu)(\sigma^2)^{-1}(x-\mu)\) measures the square of the univariate distance from \(x\) to \(\mu\) in standard deviation units. This can be generalized to a \(p\times 1\) vector \(\mathbf x\) of observations on several variables as \((\mathbf X-\boldsymbol \mu)^T(\mathbf \Sigma)^{-1}(\mathbf X-\boldsymbol \mu)\), which is the square of the multivariate generalized distance from \(\mathbf X\) to \(\boldsymbol \mu\); the \(p\times p\) matrix \(\mathbf \Sigma\) is the variance–covariance matrix of \(\mathbf X\).</description>
</item>
<item>
<title>The Bivariate Normal Distribution</title>
<link>/2020/09/10/the-bivariate-normal-distribution/</link>
<pubDate>Thu, 10 Sep 2020 00:00:00 +0000</pubDate>
<guid>/2020/09/10/the-bivariate-normal-distribution/</guid>
<description>The univariate normal pdf is:\[f_Y(y)=\frac{1}{\sqrt{2\pi}\sigma}e^{-\frac{1}{2}(\frac{y-\mu}{\sigma})^2}, \quad -\infty&lt;y&lt;+\infty\]
The bivariate normal pdf is \[f_{X,Y}(x, y)=Ke^{-\frac{1}{2}c(x^2-2\nu xy+y^2)}, \quad -\infty&lt;x, y&lt;+\infty\] where \(c\) and \(\nu\) are constants.\[\begin{align}f_{X,Y}(x, y)&amp;=Ke^{-\frac{1}{2}c(x^2-2\nu xy+y^2)}\\&amp;=Ke^{-\frac{1}{2}c(x^2-\nu^2x^2+\nu^2x^2-2\nu xy+y^2)}\\&amp;=Ke^{-\frac{1}{2}c\bigl[(x^2-\nu^2x^2)+(\nu x-y)^2\bigr]}\\&amp;=Ke^{-\frac{1}{2}cx^2(1-\nu^2)}e^{-\frac{1}{2}c(\nu x-y)^2}\\\end{align}\] The exponents must be negative, so \(1-\nu^2&gt;0\).
\[\begin{align}\int_{-\infty}^{+\infty}\int_{-\infty}^{+\infty}f_{X,Y}(x, y)dxdy&amp;=\int_{-\infty}^{+\infty}\int_{-\infty}^{+\infty}Ke^{-\frac{1}{2}cx^2(1-\nu^2)}e^{-\frac{1}{2}c(\nu x-y)^2}dxdy\\&amp;=K\int_{-\infty}^{+\infty}e^{-\frac{1}{2}cx^2(1-\nu^2)} \Biggl[\int_{-\infty}^{+\infty}e^{-\frac{1}{2}c(y-\nu x)^2}dy\Biggr]dx\\&amp;=K\int_{-\infty}^{+\infty}e^{-\frac{1}{2}cx^2(1-\nu^2)}\frac{\sqrt{2\pi}}{\sqrt{c}}dx\\&amp;=K\frac{\sqrt{2\pi}}{\sqrt{c}}\frac{\sqrt{2\pi}}{\sqrt{c(1-\nu^2)}}\\&amp;=K\frac{2\pi}{c\sqrt{1-\nu^2}}\\&amp;=1\end{align}\] Then \(K=\frac{c\sqrt{1-\nu^2}}{2\pi}\), if we choose \(c=\frac{1}{1-\nu^2}\), then \(K=\frac{1}{2\pi\sqrt{1-\nu^2}}\) and\[\begin{align}f_{X,Y}(x, y)&amp;=\frac{1}{2\pi\sqrt{1-\nu^2}}e^{-\frac{1}{2}\frac{1}{1-\nu^2}(x^2-2\nu xy+y^2)}\\&amp;=\frac{1}{2\pi\sqrt{1-\nu^2}}e^{-\frac{1}{2}\frac{1}{1-\nu^2}(x^2-\nu^2x^2+\nu^2x^2-2\nu xy+y^2)}\\&amp;=\frac{1}{2\pi\sqrt{1-\nu^2}}e^{-\frac{1}{2}x^2}e^{-\frac{1}{2}\frac{1}{1-\nu^2}(\nu x-y)^2}\end{align}\] The marginal pdfs are sure the standard normal:\[\begin{align}f_{X}(x)&amp;=\int_{-\infty}^{+\infty}f_{X,Y}(x, y)dy\\&amp;=\int_{-\infty}^{+\infty}\frac{1}{2\pi\sqrt{1-\nu^2}}e^{-\frac{1}{2}x^2}e^{-\frac{1}{2}\frac{1}{1-\nu^2}(\nu x-y)^2}dy\\&amp;=\frac{1}{2\pi\sqrt{1-\nu^2}}e^{-\frac{1}{2}x^2}\int_{-\infty}^{+\infty}e^{-\frac{1}{2}\frac{1}{1-\nu^2}(\nu x-y)^2}dy\\&amp;=\frac{1}{2\pi\sqrt{1-\nu^2}}e^{-\frac{1}{2}x^2}\sqrt{2\pi}\sqrt{1-\nu^2}\\&amp;=\frac{1}{\sqrt{2\pi}}e^{-\frac{1}{2}x^2}\end{align}\] and \(E(X)=E(Y)=0\) and \(\sigma_X=\sigma_Y=1\), then the correlation coefficient between X and Y is:\[\begin{align}\rho(X,Y)&amp;=\frac{Cov(X,Y)}{\sigma_X\sigma_Y}\\&amp;=\frac{E(XY) − E(X)E(Y)}{\sigma_X\sigma_Y}\\&amp;=E(XY)\\&amp;=\int_{-\infty}^{+\infty}\int_{-\infty}^{+\infty}xyf_{X,Y}(x, y)dxdy\\&amp;=\int_{-\infty}^{+\infty}\int_{-\infty}^{+\infty}xy\frac{1}{2\pi\sqrt{1-\nu^2}}e^{-\frac{1}{2}x^2}e^{-\frac{1}{2}\frac{1}{1-\nu^2}(\nu x-y)^2}dxdy\\&amp;=\int_{-\infty}^{+\infty}\frac{1}{\sqrt{2\pi}}xe^{-\frac{1}{2}x^2} \Biggl[\int_{-\infty}^{+\infty}\frac{1}{\sqrt{2\pi(1-\nu^2)}}ye^{-\frac{1}{2}\frac{1}{1-\nu^2}(\nu x-y)^2}dy\Biggr]dx\\&amp;=\int_{-\infty}^{+\infty}\frac{1}{\sqrt{2\pi}}xe^{-\frac{1}{2}x^2}\nu x dx\\&amp;=\nu\int_{-\infty}^{+\infty}\frac{1}{\sqrt{2\pi}}x^2e^{-\frac{1}{2}x^2}dx\\&amp;=\nu Var(X)\\&amp;=\nu \sigma_X\\&amp;=\nu\end{align}\] So \(\nu\) is the correlation coefficient between X and Y.</description>
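The factored form of the density above says that, given \(X=x\), \(Y\) is normal with mean \(\nu x\) and variance \(1-\nu^2\); a quick R simulation (the value of \(\nu\), the sample size and the seed are assumptions) recovers \(\nu\) as the correlation:
set.seed(42)
nu &lt;- 0.7                                           # assumed value of the parameter
n &lt;- 1e5
x &lt;- rnorm(n)                                       # X ~ N(0, 1)
y &lt;- rnorm(n, mean = nu * x, sd = sqrt(1 - nu^2))   # Y | X = x from the factored pdf
cor(x, y)                                           # close to nu = 0.7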
</item>
<item>
<title>Randomized block design</title>
<link>/2020/09/09/randomized-block-design/</link>
<pubDate>Wed, 09 Sep 2020 00:00:00 +0000</pubDate>
<guid>/2020/09/09/randomized-block-design/</guid>
<description>In the Randomized block design, all of the sample sizes are the same, \(b\), the number of blocks; the mathematical model associated with \(Y_{ij}\) is \(Y_{ij}=\mu_j+\beta_i+\epsilon_{ij}\), where the term \(\beta_i\) represents the effect of the \(i^{th}\) block.\[\begin{array}{|cc|cccc ccc|}\hline&amp;&amp;&amp;\text{treatment}&amp;\text{levels}&amp; &amp; &amp; Block&amp;Block&amp; Block\\&amp; &amp; 1 &amp; 2 &amp; \cdots &amp; k &amp;&amp; Totals &amp; Means &amp; Effects \\\hline&amp;1&amp; Y_{11} &amp; Y_{12} &amp; \cdots &amp; Y_{1k} &amp;&amp; T_{1.</description>
</item>
<item>
<title>Testing Subhypotheses with Contrasts</title>
<link>/2020/09/07/testing-subhypotheses-with-contrasts/</link>
<pubDate>Mon, 07 Sep 2020 00:00:00 +0000</pubDate>
<guid>/2020/09/07/testing-subhypotheses-with-contrasts/</guid>
<description>A linear combination \(C=\displaystyle\sum_{j=1}^{k}c_j\mu_j\) of the true means \(\mu_1,\mu_2,\cdots,\mu_k\) of the \(k\) factor levels of the randomized one-factor design is said to be a contrast if the sum of its coefficients \(\displaystyle\sum_{j=1}^{k}c_j=0\). Because \(\overline Y_{.j}\) is always an unbiased estimator for \(\mu_j\), we can use it to estimate \(C\): \(\hat C=\displaystyle\sum_{j=1}^{k}c_j\overline Y_{.j}\). Because the \(Y_{ij}\) are normal, \(\hat C\) is also normal. Then, \(E(\hat C)=\displaystyle\sum_{j=1}^{k}c_jE(\overline Y_{.j})=\displaystyle\sum_{j=1}^{k}c_j\mu_j=C\) and \(Var(\hat C)=\displaystyle\sum_{j=1}^{k}c_j^2Var(\overline Y_{.j})=\displaystyle\sum_{j=1}^{k}c_j^2\frac{\sigma^2}{n_j}=\sigma^2\displaystyle\sum_{j=1}^{k}\frac{c_j^2}{n_j}\). Replacing \(\sigma^2\) by its estimate \(MSE\) gives a formula for the estimated variance \(S_{\hat C}^2=MSE\displaystyle\sum_{j=1}^{k}\frac{c_j^2}{n_j}\).</description>
</item>
<item>
<title>Randomized one-factor design and the analysis of variance (ANOVA)</title>
<link>/2020/09/06/randomized-one-factor-design-and-the-analysis-of-variance-anova/</link>
<pubDate>Sun, 06 Sep 2020 00:00:00 +0000</pubDate>
<guid>/2020/09/06/randomized-one-factor-design-and-the-analysis-of-variance-anova/</guid>
<description>If we want to compare the average effects elicited by \(k\) different levels of some given factor, there will be \(k\) independent random samples of sizes \(n_j\quad (j=1,2,...,k)\), the total sample size is \(n=\displaystyle\sum_{j=1}^{k}n_j\). Let \(Y_{ij}\) represent the \(i^{th}\) observation recorded for the \(j^{th}\) level.\[\begin{array}{|c|cccc|}\hline&amp;&amp;\text{treatment}&amp;\text{levels}&amp;\\\hline&amp; 1 &amp; 2 &amp; \cdots &amp; k \\\hline&amp; Y_{11} &amp; Y_{12} &amp; \cdots &amp; Y_{1k} \\&amp; Y_{21} &amp; Y_{22} &amp; \cdots &amp; Y_{2k} \\&amp;\vdots &amp;\vdots &amp;\cdots&amp;\vdots \\&amp;Y_{n_11} &amp;Y_{n_22} &amp;\cdots&amp;Y_{n_kk} \\\text{Sample sizes:}&amp;n_1&amp;n_2&amp;\cdots&amp;n_k\\\text{Sample totals:}&amp;T_{.</description>
</item>
<item>
<title>covariance and correlation coefficient</title>
<link>/2020/09/05/covariance-and-correlation-coefficient/</link>
<pubDate>Sat, 05 Sep 2020 00:00:00 +0000</pubDate>
<guid>/2020/09/05/covariance-and-correlation-coefficient/</guid>
<description>We define the covariance of any two random variables \(X\) and \(Y\), written \(Cov(X,Y)\), as: \[\begin{align}Cov(X,Y) &amp;= E(X-\mu_X)(Y-\mu_Y)\\&amp;= E(XY-X\mu_Y-Y\mu_X+\mu_X\mu_Y)\\&amp;= E(XY)-\mu_X\mu_Y-\mu_X\mu_Y+\mu_X\mu_Y\\&amp;= E(XY) - \mu_X\mu_Y\\&amp;= E(XY) − E(X)E(Y)\\\end{align}\].If \(X\) and \(Y\) are independent random variables,\[\begin{align}E(XY)&amp;=\int\int xy\cdot f_{X,Y}(x,y)dxdy\\&amp;=\int\int xy\cdot f_X(x)f_Y(y)dxdy\\&amp;=\int x\cdot f_X(x)dx\int y\cdot f_Y(y)dy\\&amp;=E(X)E(Y)\end{align}\], then \(Cov(X,Y) = E(XY) − E(X)E(Y)=0\)
The Variance of the sum of two random variables \(aX + bY\) is:\[\begin{align}Var(aX + bY) &amp;= E(aX + bY)^2-(E(aX + bY))^2\\&amp;=E(aX + bY)^2-(a\mu_X+b\mu_Y)^2\\&amp;=E(a^2X^2+2aXbY+b^2Y^2)-a^2\mu_X^2-2a\mu_Xb\mu_Y-b^2\mu_Y^2\\&amp;=a^2(E(X^2)-\mu_X^2)+b^2(E(Y^2)-\mu_Y^2)+2ab(E(XY)-\mu_X\mu_Y)\\&amp;=a^2Var(X)+b^2Var(Y)+2abCov(X,Y)\end{align}\].</description>
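A quick simulation check in R of the identity \(Var(aX+bY)=a^2Var(X)+b^2Var(Y)+2abCov(X,Y)\) (a, b and the joint distribution below are assumed for illustration; the identity also holds exactly for sample moments, so the two numbers agree up to floating point):
set.seed(7)
n &lt;- 1e5; a &lt;- 2; b &lt;- -3
x &lt;- rnorm(n)
y &lt;- 0.5 * x + rnorm(n)                                  # correlated with x, so Cov(X, Y) is nonzero
var(a * x + b * y)                                       # sample variance of aX + bY
a^2 * var(x) + b^2 * var(y) + 2 * a * b * cov(x, y)      # right-hand side of the identity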
</item>
<item>
<title>Regression random variable Y for a given value x</title>
<link>/2020/09/04/regression-random-variable-y-for-a-given-value-x/</link>
<pubDate>Fri, 04 Sep 2020 00:00:00 +0000</pubDate>
<guid>/2020/09/04/regression-random-variable-y-for-a-given-value-x/</guid>
<description>We want to regress a random variable \(Y\) on a given value \(x\); the function \(f_{Y|x}(y)\) denotes the pdf of the random variable \(Y\) for a given value \(x\), and the expected value associated with \(f_{Y|x}(y)\) is \(E(Y | x)\). The function \(y = E(Y | x)\) is called the regression curve of \(Y\) on \(x\). The regression model is called a simple linear model if it satisfies the \(4\) assumptions:</description>
</item>
<item>
<title>Linear Regression</title>
<link>/2020/09/03/linear-regression/</link>
<pubDate>Thu, 03 Sep 2020 00:00:00 +0000</pubDate>
<guid>/2020/09/03/linear-regression/</guid>
<description>If there are \(n\) points \((x_1,y_1),(x_2,y_2),...,(x_n,y_n)\), we seek the straight line \(y=a+bx\) minimizing the sum of the squares of the vertical distances from the data points to the line, \(L=\sum_{i=1}^{n}(y_i-a-bx_i)^2\); we take partial derivatives of \(L\) with respect to \(a\) and \(b\) and set them equal to \(0\) to get the least squares coefficients \(a\) and \(b\):\[\frac{\partial L}{\partial b}=-2\sum_{i=1}^{n}(y_i-a-bx_i)x_i=0\], then \[\sum_{i=1}^{n}x_iy_i=a\sum_{i=1}^{n}x_i+b\sum_{i=1}^{n}x_i^2\]
And, \[\frac{\partial L}{\partial a}=-2\sum_{i=1}^{n}(y_i-a-bx_i)=0\], then\[\sum_{i=1}^{n}y_i=na+b\sum_{i=1}^{n}x_i\]these 2 equations are:\[\begin{bmatrix}\displaystyle\sum_{i=1}^{n}x_i &amp; \displaystyle\sum_{i=1}^{n}x_i^2\\n &amp; \displaystyle\sum_{i=1}^{n}x_i\\\end{bmatrix}\begin{bmatrix}a\\b\end{bmatrix}=\begin{bmatrix}\displaystyle\sum_{i=1}^{n}x_iy_i\\\displaystyle\sum_{i=1}^{n}y_i\end{bmatrix}\]then, using Cramer’s rule\[\begin{align}b&amp;=\frac{\begin{bmatrix}\displaystyle\sum_{i=1}^{n}x_i &amp; \displaystyle\sum_{i=1}^{n}x_iy_i\\n &amp; \displaystyle\sum_{i=1}^{n}y_i\\\end{bmatrix}}{\begin{bmatrix}\displaystyle\sum_{i=1}^{n}x_i &amp; \displaystyle\sum_{i=1}^{n}x_i^2\\n &amp; \displaystyle\sum_{i=1}^{n}x_i\\\end{bmatrix}}\\&amp;=\frac{(\displaystyle\sum_{i=1}^{n}x_i)(\displaystyle\sum_{i=1}^{n}y_i)-n(\displaystyle\sum_{i=1}^{n}x_iy_i)}{(\displaystyle\sum_{i=1}^{n}x_i)^2-n\displaystyle\sum_{i=1}^{n}x_i^2}\\&amp;=\frac{n(\displaystyle\sum_{i=1}^{n}x_iy_i)-(\displaystyle\sum_{i=1}^{n}x_i)(\displaystyle\sum_{i=1}^{n}y_i)}{n\displaystyle\sum_{i=1}^{n}x_i^2-(\displaystyle\sum_{i=1}^{n}x_i)^2}\\&amp;=\frac{(\displaystyle\sum_{i=1}^{n}x_iy_i)-\frac{1}{n}(\displaystyle\sum_{i=1}^{n}x_i)(\displaystyle\sum_{i=1}^{n}y_i)}{\displaystyle\sum_{i=1}^{n}x_i^2-\frac{1}{n}(\displaystyle\sum_{i=1}^{n}x_i)^2}\end{align}\], and, \(a=\frac{\displaystyle\sum_{i=1}^{n}y_i-b\sum_{i=1}^{n}x_i}{n}=\bar y-b\bar x\), which shows point \((\bar x, \bar y)\) is in the line.</description>
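A quick check in R (simulated data, so all numbers are assumptions) that the closed-form slope and intercept above agree with lm():
set.seed(3)
x &lt;- runif(20); y &lt;- 1 + 2 * x + rnorm(20, sd = 0.1)     # assumed example data
b &lt;- (sum(x * y) - sum(x) * sum(y) / length(x)) /
     (sum(x^2) - sum(x)^2 / length(x))                   # slope from the formula above
a &lt;- mean(y) - b * mean(x)                               # intercept: the line passes through (xbar, ybar)
c(a = a, b = b)
coef(lm(y ~ x))                                          # the same values from lm()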
</item>
<item>
<title>The square of a Student t random variable is an F distribution with 1 and n df</title>
<link>/2020/08/30/the-square-of-student-t-random-variable-is-a-f-distribution-with-with-1-and-n-df/</link>
<pubDate>Sun, 30 Aug 2020 00:00:00 +0000</pubDate>
<guid>/2020/08/30/the-square-of-student-t-random-variable-is-a-f-distribution-with-with-1-and-n-df/</guid>
<description>The Student t ratio with \(n\) degrees of freedom is denoted \(T_n\), where \(T_n=\frac{Z}{\sqrt{\frac{U}{n}}}\), \(Z\) is a standard normal random variable and \(U\) is a \(\chi^2\) random variable independent of \(Z\) with \(n\) degrees of freedom.
Because \(T_n^2= \frac{Z^2}{U/n}\) has an \(F\) distribution with \(1\) and \(n\) df, then,\[f_{T_n^2}(t)=\frac{\Gamma(\frac{1+n}{2})}{\Gamma(\frac{1}{2})\Gamma(\frac{n}{2})}\frac{n^{\frac{n}{2}}t^{-\frac{1}{2}}}{(n+t)^{\frac{1+n}{2}}},\quad t&gt;0\]
Then,\[\begin{align}f_{T_n}(t)&amp;=\frac{d}{dt}F_{T_n}(t)\\&amp;=\frac{d}{dt}P(T_n\le t)\\&amp;=\frac{d}{dt}(\frac{1}{2}+P(0\le T_n\le t))\\&amp;=\frac{d}{dt}(\frac{1}{2}+\frac{1}{2}P(-t\le T_n\le t))\quad (t&gt;0)\\&amp;=\frac{d}{dt}(\frac{1}{2}+\frac{1}{2}P(T_n^2\le t^2))\\&amp;=\frac{d}{dt}(\frac{1}{2}+\frac{1}{2}F_{T_n^2}(t^2))\\&amp;=t\cdot f_{T_n^2}(t^2)\\&amp;=t\cdot \frac{\Gamma(\frac{1+n}{2})}{\Gamma(\frac{1}{2})\Gamma(\frac{n}{2})}\frac{n^{\frac{n}{2}}t^{-1}}{(n+t^2)^{\frac{1+n}{2}}}\\&amp;=\frac{\Gamma(\frac{1+n}{2})}{\Gamma(\frac{1}{2})\Gamma(\frac{n}{2})}\frac{1}{\sqrt{n}}\frac{1}{(1+\frac{t^2}{n})^{\frac{1+n}{2}}}\\&amp;=\frac{\Gamma(\frac{1+n}{2})}{\Gamma(\frac{n}{2})}\frac{1}{\sqrt{n\pi}}\frac{1}{(1+\frac{t^2}{n})^{\frac{1+n}{2}}}\end{align}\]</description>
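Both relationships can be checked numerically in R (the degrees of freedom and the grid of t values are assumptions):
n &lt;- 7                                # assumed degrees of freedom
t &lt;- seq(0.1, 3, by = 0.1)
# P(|T_n| &lt;= t) = P(T_n^2 &lt;= t^2): the t and F(1, n) distributions agree
all.equal(2 * pt(t, df = n) - 1, pf(t^2, df1 = 1, df2 = n))
# the derived density of T_n matches R's dt()
f &lt;- gamma((1 + n) / 2) / gamma(n / 2) / sqrt(n * pi) * (1 + t^2 / n)^(-(1 + n) / 2)
all.equal(f, dt(t, df = n))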
</item>
<item>
<title>From the cumulative distribution of the standard normal distribution we can get the chi square distribution</title>
<link>/2020/08/29/from-the-cumulative-distribution-of-standard-normal-distribution-can-get-the-chi-square-distribution/</link>
<pubDate>Sat, 29 Aug 2020 00:00:00 +0000</pubDate>
<guid>/2020/08/29/from-the-cumulative-distribution-of-standard-normal-distribution-can-get-the-chi-square-distribution/</guid>
<description>The probability that a standard normal random variable lies in the region \((-x,x),x&gt;0\), is:\[\begin{align}\Phi(x)&amp;=\frac{1}{\sqrt{2\pi}}\int_{-x}^{x} e^{-\frac{1}{2}z^2}dz\\&amp;=\frac{2}{\sqrt{2\pi}}\int_{0}^{x} e^{-\frac{1}{2}z^2}dz\\&amp;=\frac{2}{\sqrt{2\pi}}\int_{0}^{x^2} \frac{1}{2\sqrt{u}}e^{-\frac{1}{2}u}du \quad (u=z^2)\\&amp;=\frac{1}{\sqrt{2\pi}}\int_{0}^{x^2} \frac{1}{\sqrt{u}}e^{-\frac{1}{2}u}du\\&amp;=\int_{0}^{x^2}\frac{(\frac{1}{2})^{\frac{1}{2}}}{\Gamma(\frac{1}{2})}u^{(\frac{1}{2})-1}e^{-\frac{1}{2}u}du\end{align}\]
Here the integrand \(f_U(u)=\frac{(\frac{1}{2})^{\frac{1}{2}}}{\Gamma(\frac{1}{2})}u^{(\frac{1}{2})-1}e^{-\frac{1}{2}u}\) is a special Gamma density with \(r=\frac{1}{2}, \lambda=\frac{1}{2}\). Here \(u=z^2\), where \(z\) is a standard normal random variable.
And the sum of \(m\) independent \(U_j=Z_j^2\) variables\[Y=\sum_{j=1}^{m} U_j=\sum_{j=1}^{m} Z_{j}^{2}\] still has a Gamma distribution:\[f_Y(y)=\frac{(\frac{1}{2})^{\frac{m}{2}}}{\Gamma(\frac{m}{2})}y^{(\frac{m}{2})-1}e^{-\frac{1}{2}y}\], and we give the special Gamma distribution with \(r=\frac{m}{2}, \lambda=\frac{1}{2}\) a new name: the \(\chi^2\) distribution with \(m\) degrees of freedom.</description>
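A numeric check in R of the identity \(P(-x\le Z\le x)=P(Z^2\le x^2)\) and of the \(\chi^2\) / Gamma correspondence (the grid of x values and the value of m are assumptions):
x &lt;- seq(0.2, 3, by = 0.2)
all.equal(pnorm(x) - pnorm(-x), pchisq(x^2, df = 1))     # standard normal in (-x, x) vs chi-square(1) at x^2
m &lt;- 5
all.equal(pchisq(x^2, df = m), pgamma(x^2, shape = m / 2, rate = 1 / 2))   # chi-square(m) is Gamma(m/2, 1/2)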
</item>
<item>
<title>Ratio of 2 independent chi square random variables divided by their degrees of freedom is F distribution</title>
<link>/2020/08/29/ratio-of-2-independent-chi-square-random-variables-divided-by-their-degrees-of-freedom-is-f-distribution/</link>
<pubDate>Sat, 29 Aug 2020 00:00:00 +0000</pubDate>
<guid>/2020/08/29/ratio-of-2-independent-chi-square-random-variables-divided-by-their-degrees-of-freedom-is-f-distribution/</guid>
<description>When \(V\) and \(U\) are two independent \(\chi^2\) random variables: \(f_V(v)=\frac{(\frac{1}{2})^{\frac{m}{2}}}{\Gamma(\frac{m}{2})}v^{(\frac{m}{2})-1}e^{-\frac{1}{2}v}\)
\(f_U(u)=\frac{(\frac{1}{2})^{\frac{n}{2}}}{\Gamma(\frac{n}{2})}u^{(\frac{n}{2})-1}e^{-\frac{1}{2}u}\)
with \(m\) and \(n\) degrees of freedom respectively, the pdf of \(W=V/U\) is:
\[\begin{align}f_{V/U}(\omega)&amp;=\int_{0}^{+\infty}|u|f_U(u)f_V(u\omega)du\\&amp;=\int_{0}^{+\infty}u\frac{(\frac{1}{2})^{\frac{n}{2}}}{\Gamma(\frac{n}{2})}u^{\frac{n}{2}-1}e^{-\frac{1}{2}u} \frac{(\frac{1}{2})^{\frac{m}{2}}}{\Gamma(\frac{m}{2})}(u\omega)^{\frac{m}{2}-1}e^{-\frac{1}{2}u\omega}du\\&amp;=\frac{(\frac{1}{2})^{\frac{n}{2}}}{\Gamma(\frac{n}{2})}\frac{(\frac{1}{2})^{\frac{m}{2}}}{\Gamma(\frac{m}{2})} \omega^{\frac{m}{2}-1} \int_{0}^{+\infty}u^{\frac{n}{2}}u^{\frac{m}{2}-1} e^{-\frac{1}{2}u(1+\omega)}du\\&amp;=\frac{(\frac{1}{2})^{\frac{n}{2}}}{\Gamma(\frac{n}{2})}\frac{(\frac{1}{2})^{\frac{m}{2}}}{\Gamma(\frac{m}{2})} \omega^{\frac{m}{2}-1} \int_{0}^{+\infty}u^{\frac{n+m}{2}-1} e^{-\frac{1}{2}u(1+\omega)}du\\&amp;=\frac{(\frac{1}{2})^{\frac{n}{2}}}{\Gamma(\frac{n}{2})}\frac{(\frac{1}{2})^{\frac{m}{2}}}{\Gamma(\frac{m}{2})} \omega^{\frac{m}{2}-1} (\frac{\Gamma(\frac{n+m}{2})}{(\frac{1}{2}(1+\omega))^{\frac{n+m}{2}}})\\&amp;=\frac{\Gamma(\frac{n+m}{2})}{\Gamma(\frac{n}{2})\Gamma(\frac{m}{2})}\frac{\omega^{\frac{m}{2}-1}}{(1+\omega)^{\frac{n+m}{2}}}\end{align}\]
Then, the pdf for \(W=\frac{V/m}{U/n}\) is:\[\begin{align}f_{\frac{V/m}{U/n}}&amp;=f_{\frac{n}{m}V/U}\\&amp;=\frac{m}{n}f_{V/U}(\frac{m}{n}\omega)\\&amp;=\frac{m}{n}\frac{\Gamma(\frac{n+m}{2})}{\Gamma(\frac{n}{2})\Gamma(\frac{m}{2})}\frac{(\frac{m}{n}\omega)^{\frac{m}{2}-1}}{(1+\frac{m}{n}\omega)^{\frac{n+m}{2}}}\\&amp;=\frac{\Gamma(\frac{n+m}{2})}{\Gamma(\frac{n}{2})\Gamma(\frac{m}{2})}\frac{m}{n}\frac{(\frac{m}{n}\omega)^{\frac{m}{2}-1}}{(n+m\omega)^{\frac{n+m}{2}}}n^{\frac{n+m}{2}}\\&amp;=\frac{\Gamma(\frac{n+m}{2})}{\Gamma(\frac{n}{2})\Gamma(\frac{m}{2})}\frac{m^{\frac{m}{2}}n^{\frac{n}{2}}\omega^{\frac{m}{2}-1}}{(n+m\omega)^{\frac{n+m}{2}}}\end{align}\], which is a \(F\) distribution with \(m\) and \(n\) degrees of freedom.</description>
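A simulation check in R (degrees of freedom, sample size and seed are assumptions) that \((V/m)/(U/n)\) behaves like an \(F_{m,n}\) variable:
set.seed(11)
m &lt;- 4; n &lt;- 9; N &lt;- 1e5
V &lt;- rchisq(N, df = m); U &lt;- rchisq(N, df = n)
W &lt;- (V / m) / (U / n)                       # ratio of chi-squares over their degrees of freedom
qs &lt;- c(0.25, 0.5, 0.75, 0.9)
cbind(empirical = quantile(W, qs), F_quantile = qf(qs, df1 = m, df2 = n))   # the columns nearly agree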
</item>
<item>
<title>Geometric Distribution is the first success occurs on kth Bernoulli trial, Negative Binomial is the rth success occurs on kth Bernoulli trial</title>
<link>/2020/08/25/geometric-distribution-is-the-first-success-occurs-on-kth-bernoulli-trial-negative-binomial-is-the-rth-success-occurs-on-kth-bernoulli-trial/</link>
<pubDate>Tue, 25 Aug 2020 00:00:00 +0000</pubDate>
<guid>/2020/08/25/geometric-distribution-is-the-first-success-occurs-on-kth-bernoulli-trial-negative-binomial-is-the-rth-success-occurs-on-kth-bernoulli-trial/</guid>
<description>The Geometric variable X has a pdf like this:\[P_X(k)=P(X=k)=(1-p)^{k-1}p, \quad k=1,2,3,..\]
The moment-generating function for a Geometric random variable X is:\[\begin{align}M_X(t)=E(e^{tX})&amp;=\sum_{all\ k}e^{tk}(1-p)^{k-1}p\\&amp;=\frac{p}{1-p}\sum_{all\ k}(e^t(1-p))^{k}\\&amp;=\frac{p}{1-p}(\frac{1}{1-e^t(1-p)}-1)\\&amp;=\frac{pe^t}{1-(1-p)e^t}\end{align}\]
The expected value is:\[\begin{align}M_X^{(1)}(t)&amp;=\frac{d}{dt}\frac{pe^t}{1-(1-p)e^t}\\&amp;=\frac{pe^t}{1-(1-p)e^t}+\frac{pe^t(1-p)e^t}{(1-(1-p)e^t)^2}\Bigl|_{t=0}\\&amp;=1+\frac{p(1-p)}{p^2}\\&amp;=\frac{1}{p}\end{align}\]
\[\begin{align}M_X^{(2)}(t)&amp;=\frac{d}{dt}\Bigl(\frac{pe^t}{1-(1-p)e^t}+\frac{pe^t(1-p)e^t}{(1-(1-p)e^t)^2}\Bigr)\\&amp;=\frac{pe^t}{1-(1-p)e^t}+\frac{pe^t(1-p)e^t}{(1-(1-p)e^t)^2}+\frac{2pe^{2t}(1-p)}{(1-(1-p)e^t)^2}+\frac{2pe^{3t}(1-p)^2}{(1-(1-p)e^t)^3}\Biggl|_{t=0}\\&amp;=1+(1/p-1)+2(1/p-1)+2(1/p-1)^2\\&amp;=2/p^2-1/p\end{align}\]
Then, the Variance is:\(Var(X)=E(X^2)-(E(X))^2=2/p^2-1/p-1/p^2=1/p^2-1/p=\frac{1-p}{p^2}\)
Negative Binomial: the rth success occurs on the kth Bernoulli trial. The Negative Binomial variable Y has a pdf like this:\[P_Y(k)=P(Y=k)=\binom{k-1}{r-1}p^r(1-p)^{k-r}, \quad k=r,r+1,r+2,.</description>
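A direct numerical check in R of \(E(X)=1/p\) and \(Var(X)=\frac{1-p}{p^2}\) (the value of p and the truncation point are assumptions; the sums use the pdf above rather than R's dgeom, which counts failures instead of trials):
p &lt;- 0.3
k &lt;- 1:2000                               # truncate the infinite sums
pk &lt;- (1 - p)^(k - 1) * p                 # P(X = k) from the pdf above
c(EX = sum(k * pk), VarX = sum(k^2 * pk) - sum(k * pk)^2)
c(1 / p, (1 - p) / p^2)                   # theoretical mean and variance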
</item>
<item>
<title>Exponential distribution is interval between consecutive Poisson events</title>
<link>/2020/08/24/exponential-distribution-is-interval-between-consecutive-poisson-events/</link>
<pubDate>Mon, 24 Aug 2020 00:00:00 +0000</pubDate>
<guid>/2020/08/24/exponential-distribution-is-interval-between-consecutive-poisson-events/</guid>
<description>Let’s denote the interval between consecutive Poisson events by the random variable Y. During the interval that extends from a to a + y, the number of Poisson events k has probability \(P(k)=e^{-\lambda y} \frac{(\lambda y)^k}{k!}\); if \(k=0\), \(e^{-\lambda y}\frac{(\lambda y)^0}{0!}=e^{-\lambda y}\) means there is no event during the (a,a+y) time period.
Because there will be no occurrences in the interval (a, a + y) if and only if \(Y &gt; y\), we have \(P(Y &gt; y)=e^{-\lambda y}\), and the cdf is \(F_Y(y)=P(Y \le y)=1-P(Y &gt; y)=1-e^{-\lambda y}\).</description>
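A short R check (the rate \(\lambda\) and the grid of y values are assumptions) that \(P(Y&gt;y)=e^{-\lambda y}\) is exactly the probability of zero Poisson events in an interval of length y, and that the resulting cdf is the Exponential cdf:
lambda &lt;- 2
y &lt;- seq(0.1, 3, by = 0.1)
all.equal(dpois(0, lambda * y), exp(-lambda * y))          # no events in (a, a + y)
all.equal(pexp(y, rate = lambda), 1 - exp(-lambda * y))    # F_Y(y) derived above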
</item>
<item>
<title>Poisson is a limit of Binomial when n goes to infinity with np maintained</title>
<link>/2020/08/24/poisson-is-a-limit-of-binomial-when-n-goes-to-infinity-with-np-maintained/</link>
<pubDate>Mon, 24 Aug 2020 00:00:00 +0000</pubDate>
<guid>/2020/08/24/poisson-is-a-limit-of-binomial-when-n-goes-to-infinity-with-np-maintained/</guid>
<description>The binomial random variable has a pdf like this:\(P_X(k)=\binom{n}{k}p^k(1-p)^{n-k},\quad k=0,1,2,...,n\)Its moment-generating function is:\[\begin{align}M_X(t)=E(e^{tX})&amp;=\sum_{k=0}^{n}e^{tk}\binom{n}{k}p^k(1-p)^{n-k}\\&amp;=\sum_{k=0}^{n}\binom{n}{k}(e^tp)^k(1-p)^{n-k}\\&amp;=(1-p+pe^t)^n\end{align}\]
Then \(M_X^{(1)}(t)=n(1-p+pe^t)^{n-1}pe^t|_{t=0}=np=E(X)\)\[\begin{align}M_X^{(2)}(t)&amp;=n(n-1)(1-p+pe^t)^{n-2}pe^tpe^t+n(1-p+pe^t)^{n-1}pe^t|_{t=0}\\&amp;=n(n-1)p^2+np=E(X^2)\end{align}\]
Then \(Var(X)=E(X^2)-(E(X))^2=n(n-1)p^2+np-(np)^2=-np^2+np=np(1-p)\)
For the binomial random variable X:\(P_X(k)=\binom{n}{k}p^k(1-p)^{n-k},\quad k=0,1,2,...,n\), if \(n\to+\infty\) with \(\lambda=np\) remains constant, then\[\begin{align}\lim_{n\to+\infty}\binom{n}{k}p^k(1-p)^{n-k}&amp;=\lim_{n\to+\infty}\frac{n!}{k!(n-k)!}(\frac{\lambda}{n})^k(1-\frac{\lambda}{n})^{n-k}\\&amp;=\lim_{n\to+\infty}\frac{n!}{k!(n-k)!}\lambda^k(\frac{1}{n})^k(1-\frac{\lambda}{n})^n(1-\frac{\lambda}{n})^{-k}\\&amp;=\frac{\lambda^k}{k!}\lim_{n\to+\infty}\frac{n!}{(n-k)!}(\frac{1}{n})^k(\frac{n}{n-\lambda})^k(1-\frac{\lambda}{n})^n\\&amp;=e^{-\lambda}\frac{\lambda^k}{k!}\lim_{n\to+\infty}\frac{n!}{(n-k)!}(\frac{1}{n-\lambda})^k\\&amp;=e^{-\lambda}\frac{\lambda^k}{k!}\lim_{n\to+\infty}\frac{n(n-1)...(n-k+1)}{(n-\lambda)(n-\lambda)...(n-\lambda)}\\&amp;=e^{-\lambda}\frac{\lambda^k}{k!}\end{align}\]
The moment-generating function of a Poisson random variable X is:\[\begin{align}M_X(t)=E(e^{tX})&amp;=\sum_{k=0}^{\infty}e^{tk}e^{-\lambda}\frac{\lambda^k}{k!}\\&amp;=e^{-\lambda}\sum_{k=0}^{\infty}\frac{(\lambda e^t)^k}{k!}\\&amp;=e^{-\lambda}e^{\lambda e^t}\\&amp;=e^{\lambda e^t-\lambda}\end{align}\]</description>
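A quick numerical illustration in R (the value of \(\lambda\), the range of k and the values of n are assumptions):
lambda &lt;- 3; k &lt;- 0:10
for (n in c(10, 100, 10000)) {
  print(max(abs(dbinom(k, size = n, prob = lambda / n) - dpois(k, lambda))))
}
# the largest difference shrinks as n grows with np = lambda held fixed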
</item>
<item>
<title>The Gamma random variable denotes the waiting time for a Poisson event also the sum of Exponential events</title>
<link>/2020/08/24/the-gamma-random-variable-denotes-the-waiting-time-for-the-rth-poisson-event/</link>
<pubDate>Mon, 24 Aug 2020 00:00:00 +0000</pubDate>
<guid>/2020/08/24/the-gamma-random-variable-denotes-the-waiting-time-for-the-rth-poisson-event/</guid>
<description>The Gamma random variable denotes the waiting time for the \(r^{th}\) Poisson event, and also denotes the sum of \(r\) independent Exponential random variables. The sum of \(m\) independent Gamma random variables (sharing the same parameter \(\lambda\)) is a Gamma random variable, which denotes the waiting time for the \((\sum_{i=1}^{m} r_i)^{th}\) Poisson event, and also denotes the sum of \(\sum_{i=1}^{m} r_i\) Exponential random variables.
Let \(Y\) denote the waiting time to the occurrence of the \(r^{th}\) Poisson event; the probability that fewer than \(r\) Poisson events occur in the \([0, y]\) time period is \(P(Y&gt;y)=\sum_{k=0}^{r-1}e^{-\lambda y}\frac{(\lambda y)^k}{k!</description>
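A numeric check in R (r, \(\lambda\) and the grid of y values are assumptions) that this tail probability is the Gamma survival function, and that a sum of r Exponential(\(\lambda\)) variables behaves like a Gamma(r, \(\lambda\)) variable:
r &lt;- 4; lambda &lt;- 2
y &lt;- seq(0.2, 5, by = 0.2)
all.equal(ppois(r - 1, lambda * y), 1 - pgamma(y, shape = r, rate = lambda))   # fewer than r events by time y
set.seed(5)
s &lt;- colSums(matrix(rexp(r * 1e5, rate = lambda), nrow = r))    # 1e5 sums of r exponentials
qs &lt;- c(0.25, 0.5, 0.75, 0.9)
cbind(empirical = quantile(s, qs), Gamma_quantile = qgamma(qs, shape = r, rate = lambda))   # nearly agree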
</item>
<item>
<title>The Gamma and Beta functions</title>
<link>/2020/08/21/the-gamma-and-beta-functions/</link>
<pubDate>Fri, 21 Aug 2020 00:00:00 +0000</pubDate>
<guid>/2020/08/21/the-gamma-and-beta-functions/</guid>
<description>The Gamma function:\[\Gamma(s)=\int_{0}^{+\infty}t^{s-1}e^{-t}dt\quad \Bigl(=(s-1)! \quad s\in \mathbb Z^+\Bigr) (0&lt;s&lt;\infty)\] Because \[\begin{align}\Gamma(s+1)&amp;=\int_{0}^{+\infty}t^{s}e^{-t}dt\\&amp;=-\int_{0}^{+\infty}t^{s}d(e^{-t})\\&amp;=-\Biggl[t^{s}e^{-t}|_{0}^{\infty}-\int_{0}^{+\infty}st^{s-1}e^{-t}dt\Biggl]\\&amp;=-\Biggl[0-s\Gamma(s)\Biggl]\\&amp;=s\Gamma(s)\end{align}\] and \[\Gamma(1)=\int_{0}^{+\infty}t^{1-1}e^{-t}dt=\int_{0}^{+\infty}e^{-t}dt=1\]The product of two Gamma functions:\[\begin{align}\Gamma(x)\Gamma(y)&amp;=\int_{0}^{+\infty}u^{x-1}e^{-u}du\int_{0}^{+\infty}v^{y-1}e^{-v}dv\\&amp;=\int_{u=0}^{+\infty}\int_{v=0}^{+\infty}e^{-(u+v)}u^{x-1}v^{y-1}dudv \quad (let\quad u+v=z; \quad u/z=t; \quad v/z=1-t; \quad dudv=zdtdz)\\&amp;=\int_{z=0}^{+\infty}\int_{t=0}^{t=1}e^{-z}(zt)^{x-1}(z(1-t))^{y-1}zdtdz\\&amp;=\int_{z=0}^{+\infty}e^{-z}z^{(x+y-1)}dz\int_{t=0}^{t=1}t^{(x-1)}(1-t)^{(y-1)}dt\\&amp;=\Gamma(x+y)\int_{t=0}^{t=1}t^{(x-1)}(1-t)^{(y-1)}dt\end{align}\]
We define this integral \(\int_{t=0}^{t=1}t^{(x-1)}(1-t)^{(y-1)}dt\) as \(B(x,y),\quad (x&gt;0 ;\quad y&gt;0)\); this is the Beta function. It satisfies \(B(x,y)=\frac{\Gamma(x)\Gamma(y)}{\Gamma(x+y)}\) \(\Bigl(=\frac{(x-1)!(y-1)!}{(x+y-1)!}\quad x;y\in \mathbb Z^+ \Bigr)\), the complete Beta function.</description>
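The identity can be checked numerically in R (the values of x and y are assumptions):
x &lt;- 2.5; y &lt;- 4.2
all.equal(beta(x, y), gamma(x) * gamma(y) / gamma(x + y))
integrate(function(t) t^(x - 1) * (1 - t)^(y - 1), 0, 1)$value   # numerically matches beta(x, y)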
</item>
<item>
<title>How to derive the beautiful probability density function (pdf) of Normal Distribution?</title>
<link>/2020/08/13/distributions/</link>
<pubDate>Thu, 13 Aug 2020 00:00:00 +0000</pubDate>
<guid>/2020/08/13/distributions/</guid>
<description>How can we derive the probability density function (pdf) of Normal Distribution?\[f_Y(y)=\frac{1}{\sqrt{2\pi}\sigma}e^{-\frac{1}{2}(\frac{y-\mu}{\sigma})^2}, \quad -\infty&lt;y&lt;+\infty\]
Let’s draw a normal pdf first
#draw normal pdf
x &lt;- seq(-5, 5, length.out = 201); dx &lt;- diff(x)[1]
y &lt;- dnorm(x, mean = 0, sd = 1)
base::plot(x, y, type = &quot;l&quot;, col = &quot;skyblue&quot;,
  xlab=&quot;x&quot; , ylab=&quot;p(x)&quot; , cex.lab=1.5,
  main=&quot;Normal Probability Density&quot; , cex.main=1.5, lwd=2)
text( 0, .6*max(y) , bquote( paste(mu ,&quot; = 0 &quot;) ), cex=1.</description>
</item>
<item>
<title>Running wsl commands using system2() function in R</title>
<link>/2020/08/12/running-wsl-commands-using-system2-function-in-r/</link>
<pubDate>Wed, 12 Aug 2020 00:00:00 +0000</pubDate>
<guid>/2020/08/12/running-wsl-commands-using-system2-function-in-r/</guid>
<description>Accession number NC_045512 in Fasta format. Using the “wsl” command in system2() to run commands in wsl:
system2(&quot;wsl&quot;, &quot;cd ~/bioinfor/; ls&quot;, stdout = TRUE)
## [1] &quot;AF086833.gb&quot; &quot;NC_045512-version1.fa&quot; &quot;RNASeqByExample&quot;
## [4] &quot;chr22.fa&quot; &quot;runinfo.csv&quot;
We can retrieve the SARS-coronavirus 2 gene sequences using efetch
system2(&quot;wsl&quot;,&quot;efetch -db=nuccore -format=gb -id=NC_045512&quot;, stdout = &quot;../../../NC_045512.gb&quot;)
Accession number NC_045512 in Fasta format.
system2(&quot;wsl&quot;,&quot;efetch -db=nuccore -format=fasta -id=NC_045512 &gt; NC_045512.fa&quot;, stdout = TRUE)
## character(0)
system2(&quot;wsl&quot;, &quot;cat .</description>
</item>
<item>
<title>About this site</title>
<link>/about/</link>
<pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate>
<guid>/about/</guid>
<description>This website was constructed using R markdown and R blogdown, the wonderful packages by Yihui Xie, and the Hugo Lithium theme. These pages are hosted by GitHub Pages.
Interesting blogs which I followed:
colah; Chris Choy; Terence Tao; Piotr Migdał; Lak Lakshmanan
1. R Core Team. R: A language and environment for statistical computing. Vienna, Austria: R Foundation for Statistical Computing; 2021. https://www.R-project.org/.
2. Xie Y, Hill AP, Thomas A.</description>
</item>
<item>
<title>Curriculum Vitae</title>
<link>/vitae/</link>
<pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate>
<guid>/vitae/</guid>
<description>Contact Information
– Email: dan@danli.org;
– Homepage: http://danli.org/;
– Orcid: orcid;
– Github: https://github.com/danli349;
– Twitter: @LiDan;
– StackExchange: StackExchange;
– Biostars: Biostars;</description>
</item>
</channel>
</rss>