| | Animal | Length | Height |
|---|---|---|---|
| 166 | Penguin | 0.583194 | 0.513319 |
| 26 | Tiger | 2.666962 | 0.948518 |
| 126 | Panda | 1.564973 | 0.941738 |
| 135 | Koala | 0.700277 | 0.628630 |
| 167 | Penguin | 0.779317 | 0.736924 |
| 18 | Lion | 2.398403 | 1.299864 |
| 81 | Zebra | 2.334960 | 1.474734 |
| 71 | Giraffe | 2.843951 | 5.444100 |

\[H(L) = \beta_0 + \beta_1 L\]

| Symbol | Description |
|---|---|
| H | Height |
| L | Length |
| \(\beta_i\) | Regression coefficients |
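
As a rough illustration of the linear model \(H = \beta_0 + \beta_1 L\), the sketch below fits the two coefficients with ordinary least squares. It uses only the eight sample rows from the first table, and it assumes NumPy's `lstsq` as the solver; the source does not say which fitting routine was used, so treat this as one possible approach rather than the original method.

```python
import numpy as np

# The eight sample rows from the first table (Length, Height).
length = np.array([0.583194, 2.666962, 1.564973, 0.700277,
                   0.779317, 2.398403, 2.334960, 2.843951])
height = np.array([0.513319, 0.948518, 0.941738, 0.628630,
                   0.736924, 1.299864, 1.474734, 5.444100])

# Design matrix: a column of ones for the intercept beta_0, then L for beta_1.
X = np.column_stack([np.ones_like(length), length])

# Ordinary least squares: minimise ||X @ beta - height||^2.
beta, *_ = np.linalg.lstsq(X, height, rcond=None)
beta_0, beta_1 = beta

predicted = X @ beta
residuals = height - predicted
print(f"beta_0 = {beta_0:.3f}, beta_1 = {beta_1:.3f}")
```

With only length as input, a single line has to serve every animal, so a tall animal such as the giraffe ends up with a large residual.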

| | Animal | Length | Height | Animal_Code |
|---|---|---|---|---|
| 147 | Koala | 0.629912 | 0.598460 | 3 |
| 10 | Lion | 2.308749 | 1.216917 | 4 |
| 139 | Koala | 0.879736 | 0.641603 | 3 |
| 169 | Penguin | 0.798349 | 0.566067 | 7 |
| 93 | Zebra | 2.432375 | 1.364283 | 9 |
| 162 | Penguin | 0.621145 | 0.546911 | 7 |
| 76 | Giraffe | 2.867906 | 5.195305 | 1 |
| 87 | Zebra | 2.150662 | 1.417177 | 9 |
| 99 | Zebra | 2.434211 | 1.584609 | 9 |
| 46 | Tiger | 2.883302 | 1.042491 | 8 |


\[H(L, N) = \beta_0 + \beta_1 L + \beta_2 N\]

| Symbol | Description |
|---|---|
| H | Height |
| L | Length |
| N | Numeric code for the animal |
| \(\beta_i\) | Regression coefficients |
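
The numeric code \(N\) can be produced in several ways. The sketch below assumes the codes follow the alphabetical order of the ten animal names, which is consistent with the `Animal_Code` values in the table above but is not confirmed by the source. The list `animals` and the use of `pd.Categorical` are choices made for this illustration.

```python
import pandas as pd

# The ten animal names, taken from the one-hot column names in this section.
animals = ["Elephant", "Giraffe", "Kangaroo", "Koala", "Lion",
           "Ostrich", "Panda", "Penguin", "Tiger", "Zebra"]

# A few rows from the table above (Animal, Length, Height).
df = pd.DataFrame({
    "Animal": ["Koala", "Lion", "Penguin", "Zebra", "Giraffe", "Tiger"],
    "Length": [0.629912, 2.308749, 0.798349, 2.432375, 2.867906, 2.883302],
    "Height": [0.598460, 1.216917, 0.566067, 1.364283, 5.195305, 1.042491],
})

# Fix the category order explicitly so the codes do not depend on
# which animals happen to appear in the sample.
df["Animal_Code"] = pd.Categorical(df["Animal"], categories=animals).codes
print(df)
```

A code like this imposes an ordering on the animals: the term \(\beta_2 N\) treats a zebra (9) as "three times" a koala (3), which has no biological meaning.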

We can create one-hot encoded data with `pandas.get_dummies(...)`:

| | Length | Height | Animal_Code | Animal_Elephant | Animal_Giraffe | Animal_Kangaroo | Animal_Koala | Animal_Lion | Animal_Ostrich | Animal_Panda | Animal_Penguin | Animal_Tiger | Animal_Zebra |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 117 | 1.412142 | 1.678782 | 2 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
| 66 | 2.943133 | 5.606900 | 1 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
| 20 | 2.383274 | 1.096715 | 4 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0 |
| 134 | 1.681995 | 1.194429 | 6 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 |
| 18 | 2.398403 | 1.299864 | 4 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0 |
| 140 | 0.996673 | 0.844714 | 3 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 |
| 162 | 0.621145 | 0.546911 | 7 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 0 |
| 75 | 3.037900 | 5.576003 | 1 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
| 116 | 1.694217 | 1.821706 | 2 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
| 68 | 3.184983 | 5.639997 | 1 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |

\[H(L, N) = \beta_0 + \beta_1 L + \sum_{i \in \{\mathrm{Lion}, \mathrm{Tiger}, \dots\}} \beta_i \, [\text{is this an } i\text{?}]\]
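
To make the indicator terms concrete, here is a minimal sketch that fits the one-hot model on the ten sample rows from the one-hot table in this section. Two things are assumptions of this example and not taken from the source: the fit uses NumPy's `lstsq`, and one indicator column is dropped (`drop_first=True`) so that the remaining indicators are not collinear with the intercept. The dropped animal then acts as the baseline absorbed into \(\beta_0\).

```python
import numpy as np
import pandas as pd

# The ten sample rows from the one-hot table (animal read off the indicator columns).
df = pd.DataFrame({
    "Animal": ["Kangaroo", "Giraffe", "Lion", "Panda", "Lion",
               "Koala", "Penguin", "Giraffe", "Kangaroo", "Giraffe"],
    "Length": [1.412142, 2.943133, 2.383274, 1.681995, 2.398403,
               0.996673, 0.621145, 3.037900, 1.694217, 3.184983],
    "Height": [1.678782, 5.606900, 1.096715, 1.194429, 1.299864,
               0.844714, 0.546911, 5.576003, 1.821706, 5.639997],
})

# One 0/1 indicator column per animal; the first category (Giraffe) is dropped.
dummies = pd.get_dummies(df["Animal"], prefix="Animal", drop_first=True, dtype=float)

# Columns: intercept, length, then the indicator columns.
X = np.column_stack([np.ones(len(df)), df["Length"].to_numpy(), dummies.to_numpy()])
y = df["Height"].to_numpy()

beta, *_ = np.linalg.lstsq(X, y, rcond=None)
names = ["beta_0", "beta_1"] + list(dummies.columns)
for name, value in zip(names, beta):
    print(f"{name}: {value:.3f}")
```

Each indicator coefficient shifts the line up or down for one animal, while the slope \(\beta_1\) is shared by all of them.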

    import pandas as pd
    import seaborn as sns
    health = sns.load_dataset('healthexp')
    display(health.sample(10))

| | Year | Country | Spending_USD | Life_Expectancy |
|---|---|---|---|---|
| 201 | 2008 | USA | 7385.026 | 78.1 |
| 261 | 2018 | USA | 10451.386 | 78.7 |
| 232 | 2014 | Canada | 4536.810 | 81.8 |
| 22 | 1975 | USA | 560.750 | 72.7 |
| 116 | 1994 | Japan | 1420.271 | 79.8 |
| 95 | 1991 | Canada | 1805.209 | 77.6 |
| 93 | 1990 | Japan | 1088.959 | 78.9 |
| 114 | 1994 | France | 1817.042 | 78.0 |
| 86 | 1989 | Great Britain | 739.714 | 75.4 |
| 170 | 2003 | Japan | 2194.437 | 81.8 |

Note: Here we use seaborn only to load a dataset. Seaborn also offers some nice options for statistical visualization, for those who are interested.

    import pandas as pd
    import seaborn as sns
    health = sns.load_dataset('healthexp')
    health_onehot = pd.get_dummies(health, columns=['Country'])
    display(health_onehot.sample(10))

| | Year | Spending_USD | Life_Expectancy | Country_Canada | Country_France | Country_Germany | Country_Great Britain | Country_Japan | Country_USA |
|---|---|---|---|---|---|---|---|---|---|
| 69 | 1986 | 1278.816 | 76.5 | 1 | 0 | 0 | 0 | 0 | 0 |
| 192 | 2007 | 3588.227 | 81.2 | 0 | 1 | 0 | 0 | 0 | 0 |
| 97 | 1991 | 842.797 | 75.9 | 0 | 0 | 0 | 1 | 0 | 0 |
| 229 | 2013 | 3667.636 | 81.1 | 0 | 0 | 0 | 1 | 0 | 0 |
| 105 | 1992 | 3100.343 | 75.7 | 0 | 0 | 0 | 0 | 0 | 1 |
| 39 | 1980 | 659.826 | 74.3 | 0 | 1 | 0 | 0 | 0 | 0 |
| 80 | 1988 | 1616.349 | 75.9 | 0 | 0 | 1 | 0 | 0 | 0 |
| 30 | 1978 | 729.457 | 72.4 | 0 | 0 | 1 | 0 | 0 | 0 |
| 111 | 1993 | 3286.558 | 75.5 | 0 | 0 | 0 | 0 | 0 | 1 |
| 179 | 2005 | 3429.955 | 79.4 | 0 | 0 | 1 | 0 | 0 | 0 |
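
The same encoding can be tried without loading the full dataset. The sketch below builds a small frame by hand from four rows of the first `healthexp` sample shown earlier; the name `small` is made up for this example. It passes `dtype=int` explicitly because recent pandas versions (2.0 and later) return boolean indicator columns by default, whereas the table above shows 0/1 values.

```python
import pandas as pd

# Four rows copied from the first healthexp sample in this section.
small = pd.DataFrame({
    "Year": [2008, 2014, 1994, 1994],
    "Country": ["USA", "Canada", "Japan", "France"],
    "Spending_USD": [7385.026, 4536.810, 1420.271, 1817.042],
    "Life_Expectancy": [78.1, 81.8, 79.8, 78.0],
})

# Replace the Country column with one 0/1 indicator column per country.
onehot = pd.get_dummies(small, columns=["Country"], dtype=int)
print(onehot)
```

Only the countries present in the input get a column, so a subset like this one yields fewer indicator columns than the full dataset does.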

| | Year | Country | Spending_USD | Life_Expectancy |
|---|---|---|---|---|
| 123 | 1995 | USA | 3586.745 | 75.7 |
| 235 | 2014 | Great Britain | 3758.935 | 81.4 |
| 176 | 2004 | Japan | 2303.680 | 82.1 |
| 136 | 1998 | Canada | 2200.468 | 78.6 |
| 190 | 2007 | Canada | 3709.615 | 80.5 |
| 108 | 1993 | France | 1753.485 | 77.5 |
| 28 | 1977 | Japan | 340.628 | 75.3 |
| 236 | 2014 | Japan | 4328.364 | 83.7 |
| 73 | 1986 | USA | 1847.773 | 74.7 |
| 58 | 1984 | Canada | 1135.020 | 76.2 |

Mean Squared Error: 7.846016617615249
R^2 Score: 0.3573359515082699

Mean Squared Error: 0.13772868450150377
R^2 Score: 0.9887186991451874

Mean Squared Error: 2.3013732097838697
R^2 Score: 0.8114954509821513
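
For reference, the two metrics printed above have simple definitions: the mean squared error is the average of the squared residuals, and \(R^2 = 1 - SS_\mathrm{res} / SS_\mathrm{tot}\). The sketch below writes them out with NumPy. The source does not show how its numbers were computed, so the two functions and the toy values are illustrative only.

```python
import numpy as np

def mean_squared_error(y_true, y_pred):
    """Average of the squared differences between observed and predicted values."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    return float(np.mean((y_true - y_pred) ** 2))

def r2_score(y_true, y_pred):
    """1 - SS_res / SS_tot: the share of variance explained by the predictions."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    ss_res = np.sum((y_true - y_pred) ** 2)
    ss_tot = np.sum((y_true - y_true.mean()) ** 2)
    return float(1.0 - ss_res / ss_tot)

# Hypothetical toy values, not taken from the datasets in this section.
y_true = [1.0, 2.0, 3.0, 4.0]
perfect = [1.0, 2.0, 3.0, 4.0]
mean_only = [2.5, 2.5, 2.5, 2.5]

print(mean_squared_error(y_true, perfect), r2_score(y_true, perfect))
print(mean_squared_error(y_true, mean_only), r2_score(y_true, mean_only))
```

A model that always predicts the mean scores \(R^2 = 0\), and a perfect model scores 1, which helps when reading the three results above. The MSE is measured in squared units of the target, so values from different datasets cannot be compared directly.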