Least Squares Method & Error Estimations

FIZ228 - Numerical Analysis
Dr. Emre S. Tasci, Hacettepe University

Note

This lecture draws heavily on Steven Chapra’s Applied Numerical Methods with MATLAB: for Engineers & Scientists.

Data & Import

The free-fall data we will be using is taken from: D. Horvat & R. Jecmenica, “The Free Fall Experiment” Resonance 21 259-275 (2016) [https://doi.org/10.1007/s12045-016-0321-9].

03_FreeFallData.png

Here’s the content of our data file:

Vert.dist. y/m,0.2,0.7,0.8,1.2,1.3,1.8
,t/s,t/s,t/s,t/s,t/s,t/s
1,0.20188,0.37775,0.40391,0.49475,0.51501,0.60603
2,0.20197,0.37779,0.40397,0.49464,0.51491,0.60610
3,0.20193,0.37770,0.40391,0.49480,0.51496,0.60610
4,0.20193,0.37779,0.40386,0.49476,0.51495,0.60608
5,0.20194,0.37779,0.40381,0.49478,0.51492,0.60605
6,0.20181,0.37780,0.40386,0.49469,0.51495,0.60612
7,0.20181,0.37773,0.40390,0.49463,0.51507,0.60601
8,0.20196,0.37774,0.40391,0.49466,0.51494,0.60606
9,0.20195,0.37774,0.40389,0.49462,0.51506,0.60613
10,0.20182,0.37776,0.40380,0.49474,0.51506,0.60603
<t>/s,0.20190,0.37776,0.40388,0.49471,0.51498,0.60607

03_FreeFallData.csv

import pandas as pd
data1 = pd.read_csv("data/03_FreeFallData.csv")
data1.columns
data1
Vert.dist. y/m 0.2 0.7 0.8 1.2 1.3 1.8
0 NaN t/s t/s t/s t/s t/s t/s
1 1 0.20188 0.37775 0.40391 0.49475 0.51501 0.60603
2 2 0.20197 0.37779 0.40397 0.49464 0.51491 0.60610
3 3 0.20193 0.37770 0.40391 0.49480 0.51496 0.60610
4 4 0.20193 0.37779 0.40386 0.49476 0.51495 0.60608
5 5 0.20194 0.37779 0.40381 0.49478 0.51492 0.60605
6 6 0.20181 0.37780 0.40386 0.49469 0.51495 0.60612
7 7 0.20181 0.37773 0.40390 0.49463 0.51507 0.60601
8 8 0.20196 0.37774 0.40391 0.49466 0.51494 0.60606
9 9 0.20195 0.37774 0.40389 0.49462 0.51506 0.60613
10 10 0.20182 0.37776 0.40380 0.49474 0.51506 0.60603
11 <t>/s 0.20190 0.37776 0.40388 0.49471 0.51498 0.60607

We don’t need the units row, the averages row, or the first column, so let’s remove them via pandas.DataFrame.drop:

data1.drop(0, inplace=True)
data1.drop(11, inplace=True)
data1.drop(['Vert.dist. y/m'],axis=1, inplace=True)
data1
0.2 0.7 0.8 1.2 1.3 1.8
1 0.20188 0.37775 0.40391 0.49475 0.51501 0.60603
2 0.20197 0.37779 0.40397 0.49464 0.51491 0.60610
3 0.20193 0.37770 0.40391 0.49480 0.51496 0.60610
4 0.20193 0.37779 0.40386 0.49476 0.51495 0.60608
5 0.20194 0.37779 0.40381 0.49478 0.51492 0.60605
6 0.20181 0.37780 0.40386 0.49469 0.51495 0.60612
7 0.20181 0.37773 0.40390 0.49463 0.51507 0.60601
8 0.20196 0.37774 0.40391 0.49466 0.51494 0.60606
9 0.20195 0.37774 0.40389 0.49462 0.51506 0.60613
10 0.20182 0.37776 0.40380 0.49474 0.51506 0.60603

Be careful: the data have been imported as strings (because the first row originally consisted of the "t/s" strings 8P ):

data1.loc[2,"0.7"]
'0.37779'
type(data1.loc[2,"0.7"])
str
data1.dtypes
0.2    object
0.7    object
0.8    object
1.2    object
1.3    object
1.8    object
dtype: object

So, let’s set them all to float:

data1 = data1.astype('float')
data1.dtypes
0.2    float64
0.7    float64
0.8    float64
1.2    float64
1.3    float64
1.8    float64
dtype: object
data1
0.2 0.7 0.8 1.2 1.3 1.8
1 0.20188 0.37775 0.40391 0.49475 0.51501 0.60603
2 0.20197 0.37779 0.40397 0.49464 0.51491 0.60610
3 0.20193 0.37770 0.40391 0.49480 0.51496 0.60610
4 0.20193 0.37779 0.40386 0.49476 0.51495 0.60608
5 0.20194 0.37779 0.40381 0.49478 0.51492 0.60605
6 0.20181 0.37780 0.40386 0.49469 0.51495 0.60612
7 0.20181 0.37773 0.40390 0.49463 0.51507 0.60601
8 0.20196 0.37774 0.40391 0.49466 0.51494 0.60606
9 0.20195 0.37774 0.40389 0.49462 0.51506 0.60613
10 0.20182 0.37776 0.40380 0.49474 0.51506 0.60603

While we’re at it, let’s also tidy things up by resetting the index:

data1.reset_index(inplace=True,drop=True)
data1
0.2 0.7 0.8 1.2 1.3 1.8
0 0.20188 0.37775 0.40391 0.49475 0.51501 0.60603
1 0.20197 0.37779 0.40397 0.49464 0.51491 0.60610
2 0.20193 0.37770 0.40391 0.49480 0.51496 0.60610
3 0.20193 0.37779 0.40386 0.49476 0.51495 0.60608
4 0.20194 0.37779 0.40381 0.49478 0.51492 0.60605
5 0.20181 0.37780 0.40386 0.49469 0.51495 0.60612
6 0.20181 0.37773 0.40390 0.49463 0.51507 0.60601
7 0.20196 0.37774 0.40391 0.49466 0.51494 0.60606
8 0.20195 0.37774 0.40389 0.49462 0.51506 0.60613
9 0.20182 0.37776 0.40380 0.49474 0.51506 0.60603

Plotting

import seaborn as sns
sns.set_theme() # To make things appear "more beautiful" 8)
data2 = data1.copy()
data2
0.2 0.7 0.8 1.2 1.3 1.8
0 0.20188 0.37775 0.40391 0.49475 0.51501 0.60603
1 0.20197 0.37779 0.40397 0.49464 0.51491 0.60610
2 0.20193 0.37770 0.40391 0.49480 0.51496 0.60610
3 0.20193 0.37779 0.40386 0.49476 0.51495 0.60608
4 0.20194 0.37779 0.40381 0.49478 0.51492 0.60605
5 0.20181 0.37780 0.40386 0.49469 0.51495 0.60612
6 0.20181 0.37773 0.40390 0.49463 0.51507 0.60601
7 0.20196 0.37774 0.40391 0.49466 0.51494 0.60606
8 0.20195 0.37774 0.40389 0.49462 0.51506 0.60613
9 0.20182 0.37776 0.40380 0.49474 0.51506 0.60603
plt1 = sns.relplot(data=data2,kind="line",marker="o")
k = plt1.set(xticks=data2.index)
[Figure: _images/FIZ228_03_LeastSquaresErrors_18_0.png — measured fall times for each drop height plotted against the trial index]
data2.mean()
0.2    0.201900
0.7    0.377759
0.8    0.403882
1.2    0.494707
1.3    0.514983
1.8    0.606071
dtype: float64
data_stats = pd.DataFrame(data2.mean())
data_stats.rename(columns={0:'dmean'}, inplace=True )
data_stats['dvar'] = data2.var()
data_stats['dstd'] = data2.std()
data_stats
dmean dvar dstd
0.2 0.201900 4.155556e-09 0.000064
0.7 0.377759 1.076667e-09 0.000033
0.8 0.403882 2.595556e-09 0.000051
1.2 0.494707 4.467778e-09 0.000067
1.3 0.514983 3.778889e-09 0.000061
1.8 0.606071 1.698889e-09 0.000041
data_stats.dstd # Unbiased
0.2    0.000064
0.7    0.000033
0.8    0.000051
1.2    0.000067
1.3    0.000061
1.8    0.000041
Name: dstd, dtype: float64
\[\sigma = \sqrt{\frac{\sum_{i}{\left(x_i - \bar{x}\right)^2}}{N}}\;\text{(Population)}\]

or

\[\sigma = \sqrt{\frac{\sum_{i}{\left(x_i - \bar{x}\right)^2}}{N-1}}\;\text{(Sample)}\]
import numpy as np
N = data2.shape[0]
for coll in list(data2.columns):
    s_dev = 0
    s_mean = data2.loc[:,coll].mean()
    #print(s_mean)
    for x_i in data2.loc[:,coll]:
        # print (x_i)
        s_dev += (x_i - s_mean)**2
    s_dev = np.sqrt(s_dev/(N))
    print("{:s}: {:.6f}".format(coll,s_dev))
0.2: 0.000061
0.7: 0.000031
0.8: 0.000048
1.2: 0.000063
1.3: 0.000058
1.8: 0.000039
data2.std(ddof=0) # Biased
0.2    0.000061
0.7    0.000031
0.8    0.000048
1.2    0.000063
1.3    0.000058
1.8    0.000039
dtype: float64

Averaged over all possible samples, the sample variance (computed with the \(N-1\) denominator) equals the population variance — this is why it is called the unbiased estimator!

For more information on this Bessel’s Correction, check: http://mathcenter.oxford.emory.edu/site/math117/besselCorrection/
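To see Bessel’s correction in action, here is a minimal simulation sketch (the distribution, seed and variable names are mine, purely for illustration): we draw many small samples from a population with a known variance and compare the averages of the ddof=0 and ddof=1 variance estimates against the true value.

rng = np.random.default_rng(228)                 # assumed seed, for reproducibility
true_var = 4.0                                   # population variance of N(0, 2)
samples = rng.normal(0, 2, size=(100000, 10))    # many samples of size N = 10

var_biased   = samples.var(axis=1, ddof=0).mean()   # divide by N
var_unbiased = samples.var(axis=1, ddof=1).mean()   # divide by N-1 (Bessel)

print(true_var, var_biased, var_unbiased)
# the ddof=1 average should land very close to 4.0, while the ddof=0 average
# underestimates it by roughly the factor (N-1)/N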

Types of Errors

True error (\(E_t\))

\[E_t = \text{true value} - \text{approximation}\]

Absolute error (\(\left|E_t\right|\))

\[\left|E_t\right| = \left|\text{true value} - \text{approximation}\right|\]

True fractional relative error

\[\frac{\text{true value} - \text{approximation}}{\text{true value}}\]

True percent relative error (\(\varepsilon_t\))

\[\varepsilon_t = \frac{\text{true value} - \text{approximation}}{\text{true value}}\,100\%\]

But what if we don’t know the true value?..

Approximate percent relative error (\(\varepsilon_a\))

\[\varepsilon_a = \frac{\text{present approximation} - \text{previous approximation}}{\text{present approximation}}\,100\%\]

Computations are repeated until \(\left|\varepsilon_a\right| < \left|\varepsilon_s\right|\) (\(\varepsilon_s\) is the satisfactory precision criterion).

The result is correct to at least \(n\) significant figures given that:

\[\varepsilon_s=\left(0.5\times10^{2-n}\right)\,\%\]
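As a quick check of how this criterion tightens with the number of significant figures, we can tabulate \(\varepsilon_s\) for a few values of \(n\) (a minimal sketch):

for n in range(1, 6):
    eps_s = 0.5*10**(2-n)
    print("n = {:d}: eps_s = {:8.4f}%".format(n, eps_s))
# n = 1 -> 5.0000%, n = 2 -> 0.5000%, n = 3 -> 0.0500%, ...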

Example:

\[e^x = 1+x+\frac{x^2}{2}+\frac{x^3}{3!}+\dots+\frac{x^n}{n!}+\dots\]

To estimate \(e^{0.5}\) so that the absolute value of the approximate error estimate falls below an error criterion conforming to 3 significant figures, how many terms do you need to include?

Solution:

\(\varepsilon_s=(0.5\times 10^{2-3})\,\%\)

eps_s = 0.5*10**(2-3)
print("{:.2f}%".format(eps_s))
0.05%
import math      # for the factorial (np.math is not available in recent NumPy versions)

eps_s = 0.5*10**(2-3)

x = 0.5
eps_a = 1000
e_prev = 0
no_of_terms = 0
print("{:>3} {:^13}\t{:^10}".format("#","e","E_a"))
while(eps_a > eps_s):
    e_calculated = e_prev + x**no_of_terms/math.factorial(no_of_terms)
    eps_a = np.abs((e_calculated - e_prev)/e_calculated)*100
    print("{:3d}:{:10.6f}\t{:16.5f}".format(no_of_terms+1,e_calculated,eps_a))
    e_prev = e_calculated
    no_of_terms += 1
  #       e      	   E_a    
  1:  1.000000	       100.00000
  2:  1.500000	        33.33333
  3:  1.625000	         7.69231
  4:  1.645833	         1.26582
  5:  1.648438	         0.15798
  6:  1.648698	         0.01580
eps_s = 0.5*10**(2-3)
e_sqrt_real = np.sqrt(np.e)

x = 0.5
eps_a = 1000
e_sqrt_prev = 0
no_of_terms = 0
print("{:>3} {:^13}{:^20}{:^12}".format("#","e_calc","E_a","E_t"))
while(eps_a > eps_s):
    e_sqrt_calculated = e_sqrt_prev + x**no_of_terms/math.factorial(no_of_terms)
    eps_a = np.abs((e_sqrt_calculated - e_sqrt_prev)/e_sqrt_calculated)*100
    eps_t = np.abs((e_sqrt_real - e_sqrt_calculated)/e_sqrt_real)*100
    print("{:3d}:{:10.6f}{:16.5f}{:16.5f}".format(no_of_terms+1,e_sqrt_calculated,eps_a,eps_t))
    e_sqrt_prev = e_sqrt_calculated
    no_of_terms += 1
  #    e_calc            E_a             E_t     
  1:  1.000000       100.00000        39.34693
  2:  1.500000        33.33333         9.02040
  3:  1.625000         7.69231         1.43877
  4:  1.645833         1.26582         0.17516
  5:  1.648438         0.15798         0.01721
  6:  1.648698         0.01580         0.00142

How good is a mean?

Once again, consider our free fall data:

data2
0.2 0.7 0.8 1.2 1.3 1.8
0 0.20188 0.37775 0.40391 0.49475 0.51501 0.60603
1 0.20197 0.37779 0.40397 0.49464 0.51491 0.60610
2 0.20193 0.37770 0.40391 0.49480 0.51496 0.60610
3 0.20193 0.37779 0.40386 0.49476 0.51495 0.60608
4 0.20194 0.37779 0.40381 0.49478 0.51492 0.60605
5 0.20181 0.37780 0.40386 0.49469 0.51495 0.60612
6 0.20181 0.37773 0.40390 0.49463 0.51507 0.60601
7 0.20196 0.37774 0.40391 0.49466 0.51494 0.60606
8 0.20195 0.37774 0.40389 0.49462 0.51506 0.60613
9 0.20182 0.37776 0.40380 0.49474 0.51506 0.60603
data_stats
dmean dvar dstd
0.2 0.201900 4.155556e-09 0.000064
0.7 0.377759 1.076667e-09 0.000033
0.8 0.403882 2.595556e-09 0.000051
1.2 0.494707 4.467778e-09 0.000067
1.3 0.514983 3.778889e-09 0.000061
1.8 0.606071 1.698889e-09 0.000041

scipy.optimize.minimize to the rescue!

data2['0.2'].shape
(10,)
from scipy.optimize import minimize

def fun_err(m,x):
    err = 0
    for x_i in x:
        err += (x_i - m)**2
    err = np.sqrt(err/x.shape[0])
    return err
fun_err(data2['0.2'].mean(),data2['0.2'])
6.11555394056866e-05
fun_err(data2['0.2'].mean()+1,data2['0.2'])
1.00000000187
minimize(fun_err,data2['0.2'].mean(),args=(data2['0.2']),tol=1E-3)
      fun: 6.11555394056866e-05
 hess_inv: array([[1]])
      jac: array([0.00012183])
  message: 'Optimization terminated successfully.'
     nfev: 2
      nit: 0
     njev: 1
   status: 0
  success: True
        x: array([0.2019])
data2['0.2'].mean()
0.20190000000000002
list(data2.columns)
['0.2', '0.7', '0.8', '1.2', '1.3', '1.8']
data_stats.loc['0.2','dmean']
0.20190000000000002
print("{:^5}: {:^8} ({:^8})".format("col","min","mean"))
for col in list(data2.columns):
    res_min = minimize(fun_err,1,args=(data2[col]))
    print("{:^5}: {:8.6f} ({:8.6f})".format(col,float(res_min.x),data_stats.loc[col,'dmean']))
 col :   min    (  mean  )
 0.2 : 0.201900 (0.201900)
 0.7 : 0.377759 (0.377759)
 0.8 : 0.403882 (0.403882)
 1.2 : 0.494707 (0.494707)
 1.3 : 0.514983 (0.514983)
 1.8 : 0.606071 (0.606071)

Couldn’t the cost function be better?

def fun_err2(m,x):
    err = 0
    for x_i in x:
        err += (x_i - m)**2
    #err = np.sqrt(err/x.shape[0])
    return err
fun_err2(data2['0.2'].mean(),data2['0.2'])
3.740000000000487e-08
minimize(fun_err2,data2['0.2'].mean(),args=(data2['0.2']))
      fun: 3.740000000000487e-08
 hess_inv: array([[1]])
      jac: array([1.49011612e-07])
  message: 'Optimization terminated successfully.'
     nfev: 2
      nit: 0
     njev: 1
   status: 0
  success: True
        x: array([0.2019])
print("{:^5}: {:^8} ({:^8})".format("col","min","mean"))
for col in list(data2.columns):
    res_min = minimize(fun_err2,1,args=(data2[col]))
    print("{:^5}: {:8.6f} ({:8.6f})".format(col,float(res_min.x),data_stats.loc[col,'dmean']))
 col :   min    (  mean  )
 0.2 : 0.201900 (0.201900)
 0.7 : 0.377759 (0.377759)
 0.8 : 0.403882 (0.403882)
 1.2 : 0.494707 (0.494707)
 1.3 : 0.514983 (0.514983)
 1.8 : 0.606071 (0.606071)
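Why does minimizing the plain sum of squared deviations also land exactly on the mean? Setting the derivative of the cost with respect to \(m\) to zero:

\[\frac{d}{dm}\sum_{i}{\left(x_i - m\right)^2} = -2\sum_{i}{\left(x_i - m\right)} = 0 \quad\Rightarrow\quad m = \frac{1}{N}\sum_{i}{x_i} = \bar{x}\]

so the least-squares “best value” of a set of repeated measurements is nothing but their arithmetic mean — exactly what minimize reports above.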
def fun_err3(m,x):
    err = 0
    for x_i in x:
        err += np.abs(x_i - m)
    #err = np.sqrt(err/x.shape[0])
    return err
fun_err3(data2['0.2'].mean(),data2['0.2'])
0.0005599999999999772
minimize(fun_err3,data2['0.2'].mean(),args=(data2['0.2']))
      fun: 0.0005000017699288428
 hess_inv: array([[7.97330502e-06]])
      jac: array([1.76244418])
  message: 'Desired error not necessarily achieved due to precision loss.'
     nfev: 120
      nit: 1
     njev: 54
   status: 2
  success: False
        x: array([0.20193])
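The reported minimum 0.20193 is not the mean but the median of the column: minimizing the sum of absolute deviations is the defining property of the median, so this "L1" cost function expresses a different (more outlier-robust) notion of a best value. A quick check:

data2['0.2'].median()
# -> 0.20193, matching the minimizer found above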
data_exp = pd.DataFrame(data_stats.dmean)
data_exp
dmean
0.2 0.201900
0.7 0.377759
0.8 0.403882
1.2 0.494707
1.3 0.514983
1.8 0.606071
def freefall_err(g,y_exp,t):
    err = 0
    y_theo = 0.5*g*t**2
    err = (y_theo - y_exp)**2
    return np.sum(err)
y_exp = np.array(list(data_exp.index))
print(y_exp)
['0.2' '0.7' '0.8' '1.2' '1.3' '1.8']
y_exp.dtype
dtype('<U3')
y_exp = np.array(list(data_exp.index),dtype=float)
y_exp.dtype
dtype('float64')
print(y_exp)
[0.2 0.7 0.8 1.2 1.3 1.8]
t = np.array(list(data_exp.dmean[:]))
print(t)
[0.2019   0.377759 0.403882 0.494707 0.514983 0.606071]

We can also do the minimization manually, by scanning over candidate values of \(g\) and watching the error decrease:

freefall_err(9,y_exp,t)
0.05069082651399326
freefall_err(9.1,y_exp,t)
0.0388634097682171
for g in np.arange(9,10,0.1):
    print("{:5.3f}:{:10.6f}".format(g,freefall_err(g,y_exp,t)))
9.000:  0.050691
9.100:  0.038863
9.200:  0.028605
9.300:  0.019915
9.400:  0.012795
9.500:  0.007243
9.600:  0.003261
9.700:  0.000847
9.800:  0.000002
9.900:  0.000726
for g in np.arange(9.7,9.9,0.01):
    print("{:5.3f}:{:10.6f}".format(g,freefall_err(g,y_exp,t)))
9.700:  0.000847
9.710:  0.000692
9.720:  0.000552
9.730:  0.000429
9.740:  0.000321
9.750:  0.000228
9.760:  0.000152
9.770:  0.000091
9.780:  0.000045
9.790:  0.000016
9.800:  0.000002
9.810:  0.000004
9.820:  0.000021
9.830:  0.000055
9.840:  0.000103
9.850:  0.000168
9.860:  0.000248
9.870:  0.000344
9.880:  0.000456
9.890:  0.000583
9.900:  0.000726
for g in np.arange(9.79,9.81,0.001):
    print("{:5.3f}:{:10.8f}".format(g,freefall_err(g,y_exp,t)))
9.790:0.00001591
9.791:0.00001382
9.792:0.00001188
9.793:0.00001010
9.794:0.00000848
9.795:0.00000701
9.796:0.00000570
9.797:0.00000455
9.798:0.00000355
9.799:0.00000272
9.800:0.00000203
9.801:0.00000151
9.802:0.00000114
9.803:0.00000093
9.804:0.00000088
9.805:0.00000098
9.806:0.00000124
9.807:0.00000166
9.808:0.00000223
9.809:0.00000296
9.810:0.00000385
res_min = minimize(freefall_err,x0=1,args=(y_exp,t))
print(res_min)
      fun: 8.748023164225757e-07
 hess_inv: array([[6.3736979]])
      jac: array([-2.0061961e-09])
  message: 'Optimization terminated successfully.'
     nfev: 10
      nit: 4
     njev: 5
   status: 0
  success: True
        x: array([9.8038438])
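Since the model \(y=\frac{1}{2}gt^2\) is linear in \(g\), the same minimum can also be written in closed form: setting the derivative of \(\sum_i{\left(\frac{1}{2}g t_i^2 - y_i\right)^2}\) with respect to \(g\) to zero gives

\[g^{*} = \frac{\sum_{i}{\frac{1}{2}t_i^2\, y_i}}{\sum_{i}{\left(\frac{1}{2}t_i^2\right)^2}}\]

which we can verify directly (a minimal check, variable names mine):

a = 0.5*t**2
g_closed = np.sum(a*y_exp)/np.sum(a**2)
g_closed   # should agree with the minimize result (~9.8038)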
import matplotlib.pyplot as plt
plt.plot(t,y_exp,"or")
tt = np.linspace(0,0.7,100)
plt.plot(tt,0.5*res_min.x*tt**2,"-b")
plt.show()
[Figure: _images/FIZ228_03_LeastSquaresErrors_68_0.png — measured (t, y) points with the fitted y = ½gt² curve]
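As a small aside tying back to the error definitions above, we can treat the standard gravity value \(9.80665\;\mathrm{m/s^2}\) as the “true value” and compute the true percent relative error of our fit (a minimal sketch):

g_std = 9.80665                          # standard gravity, taken here as the "true value"
eps_t = np.abs(g_std - res_min.x[0])/g_std*100
print("{:.4f}%".format(eps_t))           # should come out around 0.03%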

Least Squares (numpy.linalg.lstsq & scipy.linalg.lstsq & scipy.optimize.least_squares) but not least! 8)

NumPy’s and SciPy’s linalg.lstsq functions work similarly, solving the matrix equation \(Ax=b\). Since the coefficient matrix \(A\) must be two-dimensional, we add a “zeros” column next to the \(\tfrac{1}{2}t^2\) values to indicate that our equation is of the form:

\[\left(\frac{1}{2}t^2\right)g^1 + (0)g^0 = y_{exp}\]
A = np.vstack([(0.5*t**2),np.zeros(len(t))]).T
A
array([[0.02038181, 0.        ],
       [0.07135093, 0.        ],
       [0.08156033, 0.        ],
       [0.12236751, 0.        ],
       [0.13260375, 0.        ],
       [0.18366103, 0.        ]])

numpy.linalg.lstsq

g_ls_np, _ = np.linalg.lstsq(A,y_exp,rcond=None)[0]
g_ls_np
9.803843815755029
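Incidentally, the zeros column is only there to make \(A\) two-dimensional; a single-column \(A\) works just as well (a minimal check):

A1 = (0.5*t**2).reshape(-1, 1)                      # (6, 1) coefficient matrix, no zeros column
g_only = np.linalg.lstsq(A1, y_exp, rcond=None)[0][0]
g_only                                              # should again give ~9.8038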

scipy.linalg.lstsq

import scipy as sp
from scipy import linalg   # make sure the linalg submodule is loaded
g_ls_sp, _ = sp.linalg.lstsq(A,y_exp)[0]
g_ls_sp
9.803843815755029

scipy.optimize.least_squares

SciPy’s optimize.least_squares is a totally different beast, though: it minimizes a cost function built from a vector of residuals. Its main difference from the two routines above is that it supports nonlinear least-squares problems and also accepts bounds on the variables. It is included here only to give you an idea (as its use here is more or less the same as with optimize.minimize ;)

def fun_err_g(g):
    return (0.5*g*t**2 - y_exp)**2
g_ls_sp_opt = sp.optimize.least_squares(fun_err_g,10).x[0]
g_ls_sp_opt
9.81433312096714
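A caveat for when you use it seriously: least_squares expects the function to return the vector of residuals themselves (it squares and sums them internally), whereas fun_err_g above returns the squared residuals, so the routine effectively minimizes a fourth-power cost — which likely explains the small deviation from the ~9.8038 value found by the other methods. A minimal sketch of the plain-residual version (function name mine):

def fun_res_g(g):
    # return the residual vector itself; least_squares squares & sums internally
    return 0.5*g*t**2 - y_exp

g_ls_sp_opt2 = sp.optimize.least_squares(fun_res_g, 10).x[0]
g_ls_sp_opt2   # should reproduce the ordinary least-squares value (~9.8038)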

Various Definitions

Sum of the squares of the data residuals (\(S_t\))

\[S_t = \sum_{i}{\left(y_i - \bar{y}\right)^2}\]

(\(\leftrightarrow\) Standard deviation \(\sigma=\sqrt{\frac{S_t}{n-1}}\), variance \(\sigma^2=\frac{\sum_{i}{\left(y_i-\bar{y}\right)^2}}{n-1}=\frac{\sum_{i}{y_i^2}-\left(\sum_{i}{y_i}\right)^2/n}{n-1}\))

Coefficient of variation

\[\text{c.v.} = \frac{s_y}{\bar{y}}\times\,100\%\]

Sum of the squares of the estimate residuals (\(S_r\))

\[S_r = \sum_{i}{e_i^2}=\sum_{i}\left(y_i-a_0-a_1 x_i\right)^2\]

Standard error of the estimate: (\(s_{y/x}\))

\[s_{y/x}= \sqrt{\frac{S_r}{n-2}}\]

[Figure: standard_errors.png — (a) spread of the data about its mean (\(S_t\)); (b) spread of the data about the fitted line (\(S_r\))]

(Source: S.C. Chapra, Applied Numerical Methods with MATLAB)

Coefficient of Determination (\(r^2\))

\[r^2 = \frac{S_t-S_r}{S_t}\]
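To close the loop, here is a minimal sketch applying these definitions to our free-fall fit (note that the \(n-2\) in \(s_{y/x}\) assumes two fitted coefficients, as in a straight-line fit, while our model only fits \(g\); variable names are mine):

g_fit = res_min.x[0]                         # ~9.8038 from the earlier minimization
y_model = 0.5*g_fit*t**2

S_t  = np.sum((y_exp - y_exp.mean())**2)     # spread of the data around its mean
S_r  = np.sum((y_exp - y_model)**2)          # spread of the data around the model
s_yx = np.sqrt(S_r/(len(y_exp) - 2))         # standard error of the estimate
r2   = (S_t - S_r)/S_t                       # coefficient of determination

print(S_t, S_r, s_yx, r2)                    # r2 should come out extremely close to 1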

References