Regression - Academic Performance Case [2]
This post continues from part 1.
In part 1 we reached the following conclusion about our hypothesis (expectation):
"Smaller class sizes are correlated with higher school academic performance; fewer students receiving free meals (that is, fewer poor students) is correlated with higher school academic performance; and the percentage of teachers with full credentials is not correlated with school academic performance."
But is this conclusion valid, and is it fit to publish?
Not necessarily!
The reason is that we have not yet "cleaned" the data.
Errors commonly creep in when data is entered. A value may be typed in the wrong format, a stray - keystroke may turn a positive number into a negative one, and sometimes values are simply missing. All of these affect the results.
How do we "clean" the data?
This article walks through the steps.
1. As a first step, try to understand the data set
. describe

Contains data from E:\Latihan\elemapi.dta
  obs:           400
 vars:            21                          18 Oct 2020 19:23
 size:        13,200
-------------------------------------------------------------------------------
              storage  display     value
variable name   type   format      label      variable label
-------------------------------------------------------------------------------
snum            int    %9.0g                  school number
dnum            int    %7.0g       dname      district number
api00           int    %6.0g                  api 2000
api99           int    %6.0g                  api 1999
growth          int    %6.0g                  growth 1999 to 2000
meals           byte   %4.0f                  pct free meals
ell             byte   %4.0f                  english language learners
yr_rnd          byte   %4.0f       yr_rnd     year round school
mobility        byte   %4.0f                  pct 1st year in school
acs_k3          byte   %4.0f                  avg class size k-3
acs_46          byte   %4.0f                  avg class size 4-6
not_hsg         byte   %4.0f                  parent not hsg
hsg             byte   %4.0f                  parent hsg
some_col        byte   %4.0f                  parent some college
col_grad        byte   %4.0f                  parent college grad
grad_sch        byte   %4.0f                  parent grad school
avg_ed          float  %9.0g                  avg parent ed
full            float  %4.0f                  pct full credential
emer            byte   %4.0f                  pct emer credential
enroll          int    %9.0g                  number of students
mealcat         byte   %18.0g      mealcat    Percentage free meals in 3 categories
-------------------------------------------------------------------------------
Sorted by:  dnum
. list api00 meals yr_rnd acs_k3 full in 1/20
+------------------------------------------+
| api00 meals yr_rnd acs_k3 full |
|------------------------------------------|
1. | 693 67 No 16 76.00 |
2. | 570 92 No 15 79.00 |
3. | 546 97 No 17 68.00 |
4. | 571 90 No 20 87.00 |
5. | 478 89 No 18 87.00 |
|------------------------------------------|
6. | 858 . No 20 100.00 |
7. | 918 . No 19 100.00 |
8. | 831 . No 20 96.00 |
9. | 860 . No 20 100.00 |
10. | 737 29 No 21 96.00 |
|------------------------------------------|
11. | 851 . No 20 100.00 |
12. | 536 71 No 21 100.00 |
13. | 847 . No 20 97.00 |
14. | 765 13 No 21 98.00 |
15. | 809 . No 20 89.00 |
|------------------------------------------|
16. | 813 . No 21 100.00 |
17. | 856 . No 21 100.00 |
18. | 712 40 Yes 19 85.00 |
19. | 805 . No 21 93.00 |
20. | 678 37 Yes 20 83.00 |
+------------------------------------------+
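The "." entries in the meals column of the listing above are Stata's missing values. A minimal sketch of how one might count them before going further, assuming the elemapi.dta file from the describe step is still in memory (the choice of variables to check is ours, not the article's):

```stata
* Assumption: elemapi.dta from the describe step is still in memory.

* Count the observations in which meals is missing (shown as "." above)
count if missing(meals)

* One-table overview of missing values for the variables used in the model;
* variables without any missing values are not listed
misstable summarize api00 acs_k3 meals full
```

This is only a diagnostic; it does not change the data.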
. summarize api00 acs_k3 meals full yr_rnd

    Variable |       Obs        Mean    Std. Dev.       Min        Max
-------------+---------------------------------------------------------
       api00 |       400    647.6225     142.249        369        940
      acs_k3 |       398    18.54774    5.004933        -21         25
       meals |       315    71.99365    24.38557          6        100
        full |       400     66.0568    40.29793        .42        100
      yr_rnd |       400         .23    .4213595          0          1
. tabulate acs_k3

  avg class |
   size k-3 |      Freq.     Percent        Cum.
------------+-----------------------------------
        -21 |          3        0.75        0.75
        -20 |          2        0.50        1.26
        -19 |          1        0.25        1.51
         14 |          2        0.50        2.01
         15 |          1        0.25        2.26
         16 |         14        3.52        5.78
         17 |         20        5.03       10.80
         18 |         64       16.08       26.88
         19 |        143       35.93       62.81
         20 |         97       24.37       87.19
         21 |         40       10.05       97.24
         22 |          7        1.76       98.99
         23 |          3        0.75       99.75
         25 |          1        0.25      100.00
------------+-----------------------------------
      Total |        398      100.00
. tabulate dnum if acs_k3 < 0
district |
number | Freq. Percent Cum.
------------+-----------------------------------
140 | 6 100.00 100.00
------------+-----------------------------------
Total | 6 100.00
. replace acs_k3=19 if acs_k3==-19
(1 real change made)

. replace acs_k3=20 if acs_k3==-20
(2 real changes made)

. replace acs_k3=21 if acs_k3==-21
(3 real changes made)

. tabulate acs_k3

  avg class |
   size k-3 |      Freq.     Percent        Cum.
------------+-----------------------------------
         14 |          2        0.50        0.50
         15 |          1        0.25        0.75
         16 |         14        3.52        4.27
         17 |         20        5.03        9.30
         18 |         64       16.08       25.38
         19 |        144       36.18       61.56
         20 |         99       24.87       86.43
         21 |         43       10.80       97.24
         22 |          7        1.76       98.99
         23 |          3        0.75       99.75
         25 |          1        0.25      100.00
------------+-----------------------------------
      Total |        398      100.00
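The three replace commands above each flip the sign of one negative value. As a sketch, the same correction could be written as a single command, assuming every negative class size is a sign-entry error (the assumption the article itself makes) and that the uncorrected data is in memory:

```stata
* Assumption: the raw elemapi.dta is in memory and every negative
* acs_k3 value is a positive class size entered with a stray minus sign.
replace acs_k3 = -acs_k3 if acs_k3 < 0

* Missing values are untouched by the condition above, because Stata
* treats missing as larger than any number. For the same reason the
* check below passes for missing values and fails only on values <= 0.
assert acs_k3 > 0
```

If the three replace commands have already been run, this command simply makes no changes.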
. tabulate full

   pct full |
 credential |      Freq.     Percent        Cum.
------------+-----------------------------------
       0.42 |          1        0.25        0.25
       0.45 |          1        0.25        0.50
       0.46 |          1        0.25        0.75
       0.47 |          1        0.25        1.00
       0.48 |          1        0.25        1.25
       0.50 |          3        0.75        2.00
       0.51 |          1        0.25        2.25
       0.52 |          1        0.25        2.50
       0.53 |          1        0.25        2.75
       0.54 |          1        0.25        3.00
       0.56 |          2        0.50        3.50
       0.57 |          2        0.50        4.00
       0.58 |          1        0.25        4.25
       0.59 |          3        0.75        5.00
       0.60 |          1        0.25        5.25
       0.61 |          4        1.00        6.25
       0.62 |          2        0.50        6.75
       0.63 |          1        0.25        7.00
       0.64 |          3        0.75        7.75
       0.65 |          3        0.75        8.50
       0.66 |          2        0.50        9.00
       0.67 |          6        1.50       10.50
       0.68 |          2        0.50       11.00
       0.69 |          3        0.75       11.75
       0.70 |          1        0.25       12.00
       0.71 |          1        0.25       12.25
       0.72 |          2        0.50       12.75
       0.73 |          6        1.50       14.25
       0.75 |          4        1.00       15.25
       0.76 |          2        0.50       15.75
       0.77 |          2        0.50       16.25
       0.79 |          3        0.75       17.00
       0.80 |          5        1.25       18.25
       0.81 |          8        2.00       20.25
       0.82 |          2        0.50       20.75
       0.83 |          2        0.50       21.25
       0.84 |          2        0.50       21.75
       0.85 |          3        0.75       22.50
       0.86 |          2        0.50       23.00
       0.90 |          3        0.75       23.75
       0.92 |          1        0.25       24.00
       0.93 |          1        0.25       24.25
       0.94 |          2        0.50       24.75
       0.95 |          2        0.50       25.25
       0.96 |          1        0.25       25.50
       1.00 |          2        0.50       26.00
      37.00 |          1        0.25       26.25
      41.00 |          1        0.25       26.50
      44.00 |          2        0.50       27.00
      45.00 |          2        0.50       27.50
      46.00 |          1        0.25       27.75
      48.00 |          1        0.25       28.00
      53.00 |          1        0.25       28.25
      57.00 |          1        0.25       28.50
      58.00 |          3        0.75       29.25
      59.00 |          1        0.25       29.50
      61.00 |          1        0.25       29.75
      63.00 |          2        0.50       30.25
      64.00 |          1        0.25       30.50
      65.00 |          1        0.25       30.75
      68.00 |          2        0.50       31.25
      69.00 |          3        0.75       32.00
      70.00 |          1        0.25       32.25
      71.00 |          3        0.75       33.00
      72.00 |          1        0.25       33.25
      73.00 |          2        0.50       33.75
      74.00 |          1        0.25       34.00
      75.00 |          4        1.00       35.00
      76.00 |          4        1.00       36.00
      77.00 |          2        0.50       36.50
      78.00 |          4        1.00       37.50
      79.00 |          3        0.75       38.25
      80.00 |         10        2.50       40.75
      81.00 |          4        1.00       41.75
      82.00 |          3        0.75       42.50
      83.00 |          9        2.25       44.75
      84.00 |          4        1.00       45.75
      85.00 |          8        2.00       47.75
      86.00 |          5        1.25       49.00
      87.00 |         12        3.00       52.00
      88.00 |          6        1.50       53.50
      89.00 |          5        1.25       54.75
      90.00 |          9        2.25       57.00
      91.00 |          8        2.00       59.00
      92.00 |          7        1.75       60.75
      93.00 |         12        3.00       63.75
      94.00 |         10        2.50       66.25
      95.00 |         17        4.25       70.50
      96.00 |         17        4.25       74.75
      97.00 |         11        2.75       77.50
      98.00 |          9        2.25       79.75
     100.00 |         81       20.25      100.00
------------+-----------------------------------
      Total |        400      100.00
. tabulate dnum if full <= 1

   district |
     number |      Freq.     Percent        Cum.
------------+-----------------------------------
        401 |        104      100.00      100.00
------------+-----------------------------------
      Total |        104      100.00
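All 104 suspicious observations come from district 401, and their values of full lie between 0.42 and 1.00, while every other school reports values between 37 and 100. A plausible reading is that this one district entered proportions instead of percentages. The sketch below rescales those values under that assumption; the `full <= 1` cutoff is our assumption, chosen to match the gap visible in the tabulation of full.

```stata
* Assumption: values of full at or below 1 are proportions (0.42 = 42%)
* entered by district 401, and the raw elemapi.dta is in memory.
replace full = full * 100 if full <= 1

* After rescaling, every value should be a percentage between 1 and 100
assert inrange(full, 1, 100)
```

A value of exactly 1.00 is ambiguous (1% or 100%); here it is treated as 100% because it sits with the other proportions in district 401.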
. summarize full

    Variable |       Obs        Mean    Std. Dev.       Min        Max
-------------+---------------------------------------------------------
        full |       400       84.55    14.94979         37        100
use https://stats.idre.ucla.edu/stat/stata/webbooks/reg/elemapi2
regress api00 acs_k3 meals full
Source | SS df MS Number of obs = 398
-------------+------------------------------ F( 3, 394) = 615.55
Model | 6604966.18 3 2201655.39 Prob > F = 0.0000
Residual | 1409240.96 394 3576.7537 R-squared = 0.8242
-------------+------------------------------ Adj R-squared = 0.8228
Total | 8014207.14 397 20186.9197 Root MSE = 59.806
------------------------------------------------------------------------------
api00 | Coef. Std. Err. t P>|t| [95% Conf. Interval]
-------------+----------------------------------------------------------------
acs_k3 | -.7170622 2.238821 -0.32 0.749 -5.118592 3.684468
meals | -3.686265 .1117799 -32.98 0.000 -3.906024 -3.466505
full | 1.327138 .2388739 5.56 0.000 .857511 1.796765
_cons | 771.6581 48.86071 15.79 0.000 675.5978 867.7184
-----------------------------------------------------------------------------
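In the output above, acs_k3 is not significant (p = 0.749) while full is (p = 0.000), which differs from the conclusion quoted from part 1. To see how much the data errors move the estimates, one could fit the same model on both files and put the results side by side. This is a sketch that assumes elemapi2 is the corrected version of elemapi.dta and that the local path from the describe step is still valid; the names raw and clean are arbitrary labels.

```stata
* Fit the model on the uncleaned file (path taken from the describe step)
use "E:\Latihan\elemapi.dta", clear
regress api00 acs_k3 meals full
estimates store raw

* Fit the same model on elemapi2 (assumed to be the corrected file)
use https://stats.idre.ucla.edu/stat/stata/webbooks/reg/elemapi2, clear
regress api00 acs_k3 meals full
estimates store clean

* Coefficients, standard errors, sample size and R-squared side by side
estimates table raw clean, b(%9.3f) se stats(N r2)
```

The N row is worth checking first: observations with missing meals or acs_k3 are dropped from the uncleaned regression, so the two models are not fit on the same schools.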