Edoardo Ottavianelli

Software Developer

Computer Science Student at Sapienza University. Passionate about Computing, Nature and the whole sphere of Science.


[Statistics] Lesson 1

Author: Edoardo Ottavianelli
10/10/2020

Researches about theory (R)

1_R) Describe the notion of statistical population. What is the population in descriptive statistics and what is the population in inferential statistics: point out the differences.

A statistical population is the set of items or events involved in a statistical analysis; we can think of it as all the data we're interested in. Descriptive statistics, as the name suggests, summarizes the data and presents it in a more readable and comprehensible way. So, in descriptive statistics the population is all the raw data we would like to understand, present in a readable way, or examine for interesting correlations and patterns. Sometimes we can't analyze the whole population (e.g. I would like to grade all the university students in the world), so we can only sample the data. Sampling selects a group of items from the population that is meant to represent all the data. Inferential statistics lets us draw conclusions about a larger, mostly unknown population from known samples taken from it. So, in inferential statistics the population is the larger set we want to draw conclusions about, while the data we actually process is only a sample of it [1].

2_R) Describe the notion of statistical attributes/variables and dataset, and explain how a dataset is generated.

A statistical attribute is a characteristic of an entity, e.g. the age or nationality of a person. An attribute can take values of a certain kind, and the set of all its possible values is its domain. The value of an attribute can change as events happen. A variable is the concrete instance of an attribute for a given unit, so it can vary over time. A dataset is a collection of data. The information involved can be quantitative or qualitative, and some values may even be missing. A dataset is organized in columns and rows: each column holds the values of one attribute across all the units, and each row holds the values of all the attributes for one unit. This is an example of a dataset.

Company  | Revenue (2019) | Employees | Country
Google   | 160.74 bln $   | 115'000   | USA
Ferrari  | 3.767 bln EUR  | 4'285     | Italy
GigaByte | 1.94 bln $     | -         | Taiwan
IBM      | 77.14 bln $    | 352'600   | USA
In this case we identify our statistical units by the company name. The dataset involves 3 attributes: the 2019 revenue, the number of employees and the country of the headquarters. There are 4 rows, so the dataset describes 4 entities. All the values in the records are variables, because each one is a particular instance of an attribute for a particular entity, and these values could change over time.
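
As a rough illustration, the dataset above could be held in C# as a list of units, each carrying one value per attribute. This is only a sketch: the names Company, RevenueBillions, Currency and DatasetDemo are hypothetical, and the missing employee count for GigaByte is modeled here with a nullable int.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical type: one statistical unit, i.e. one row of the dataset.
public class Company
{
    public string Name;
    public decimal RevenueBillions;  // 2019 revenue, in billions
    public string Currency;          // the table mixes $ and EUR
    public int? Employees;           // null models a missing value
    public string Country;
}

public static class DatasetDemo
{
    // Builds the four rows shown in the table.
    public static List<Company> Build()
    {
        return new List<Company>
        {
            new Company { Name = "Google", RevenueBillions = 160.74m, Currency = "USD", Employees = 115000, Country = "USA" },
            new Company { Name = "Ferrari", RevenueBillions = 3.767m, Currency = "EUR", Employees = 4285, Country = "Italy" },
            new Company { Name = "GigaByte", RevenueBillions = 1.94m, Currency = "USD", Employees = null, Country = "Taiwan" },
            new Company { Name = "IBM", RevenueBillions = 77.14m, Currency = "USD", Employees = 352600, Country = "USA" }
        };
    }

    public static void Main()
    {
        foreach (Company c in Build())
        {
            // A missing value has to be handled explicitly when reading the data.
            string employees = c.Employees.HasValue ? c.Employees.Value.ToString() : "missing";
            Console.WriteLine(c.Name + ": " + employees + " employees, " + c.Country);
        }
    }
}
```

Each field corresponds to a column (attribute) and each object to a row (unit), which mirrors the column/row description given earlier.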

3_R) Explain the differences between a (univariate) dataset and a (univariate) frequency distribution. Given a distribution, can we reconstruct the dataset? Why? How would you describe the change in the amount of information when passing from the dataset to the distribution?

A (univariate) dataset associates each statistical unit with the value of the variable involved in the analysis. To produce a (univariate) frequency distribution from such a dataset we have to:

  • Define the groups and their constraints (the requirements for belonging to a particular group)
  • Group the units into sets according to those constraints
  • Count the frequency of each group (dividing it by the total number of units gives the relative frequency)
For example, if we have this dataset:
Person    | Age
Edoardo   | 23
Francesca | 45
Manuele   | 1
Chiara    | 22
Mario     | 56
Francesco | 98
Cristian  | 33
Zaira     | 77
Marco     | 16
Gianna    | 19
First of all, we have to define the groups and their constraints. In this case I would like to define 4 groups: 0-25, 25-50, 50-75 and 75-100 (with the right endpoint of each group excluded). So, the distribution will be:
Age group | Frequency
0-25      | 5
25-50     | 2
50-75     | 1
75-100    | 2
I can't go back from the distribution to the dataset, because here I lose the association between units and values. I can only point out some characteristics of the data, e.g. that 5 people are between 0 and 24 years old (inclusive), but I can't know how old Edoardo is. In this case (and in general, of course) I lose information, but I gain knowledge, since the distribution shows the overall shape of the data at a glance. There is another important point: I respect the privacy of the people involved in the analysis (again, I can't know how old Edoardo is), and of course I care about that.
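
The grouping steps described above can be sketched in C# as follows. This is a minimal sketch, assuming every age lies between 0 and 99 and that the four groups all have width 25; the names FrequencyDemo and Distribution are made up for illustration.

```csharp
using System;
using System.Collections.Generic;

public static class FrequencyDemo
{
    // Width of each group: [0,25), [25,50), [50,75), [75,100).
    const int Width = 25;

    // Counts how many ages fall into each group.
    // Assumption: every age satisfies 0 <= age < 100.
    public static int[] Distribution(IEnumerable<int> ages)
    {
        int[] counts = new int[4];
        foreach (int age in ages)
        {
            counts[age / Width]++;  // integer division selects the group index
        }
        return counts;
    }

    public static void Main()
    {
        // The ages from the dataset above, without the names.
        int[] ages = { 23, 45, 1, 22, 56, 98, 33, 77, 16, 19 };
        int[] counts = Distribution(ages);
        for (int i = 0; i < counts.Length; i++)
        {
            Console.WriteLine((i * Width) + "-" + ((i + 1) * Width) + ": " + counts[i]);
        }
    }
}

// OUTPUT:
// 0-25: 5
// 25-50: 2
// 50-75: 1
// 75-100: 2
```

Note that Distribution returns only four counts: neither the names nor the exact ages survive, which is exactly the loss of information discussed above.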


Applications / Practice (A)

1_A) Create, in both languages C# and VB.NET, a program which does the following simple tasks:
  • when a button is pressed some text appears in a richtextbox on the startup form
  • when another button is pressed the richtextbox is cleared
  • when the mouse enters the richtextbox, the richtextbox backcolor is switched to another color
  • when the mouse leaves the richtextbox, the richtextbox backcolor is reset to its original state

(The cursor isn't shown because Windows is running in a Virtual Machine)

2_A) Create or search for some simple but illuminating example of code which clearly shows the different behaviors of reference types and value types.

Types in the .NET Framework are treated either as value types or as reference types; a similar distinction applies to other programming languages that have the concept of pointers.

"A Value Type holds the data within its own memory allocation and a Reference Type contains a pointer to another memory location that holds the real data. Reference Type variables are stored in the heap while Value Type variables are stored in the stack" [2].
[Figure: reference and value types]
Code Example: Passing Value Type Variables
Here the value is copied when the method is called, so the change is visible only inside the method and the caller's variable keeps its original value.
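using System;

class Program
{
    // x is a copy of the caller's variable, so assigning to it does not affect the caller.
    static void ChangeValue(int x)
    {
        x = 200;
        Console.WriteLine(x);
    }

    static void Main(string[] args)
    {
        int i = 100;

        Console.WriteLine(i);   // original value

        ChangeValue(i);         // prints the modified copy

        Console.WriteLine(i);   // still the original value
    }
}

// OUTPUT:
// 100
// 200
// 100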
Code Example: Passing Reference Type Variable
Here, instead, the value passed to the method is a reference to the object, so the method works on the same object as the caller and the change is also visible outside the method.
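using System;

// Minimal definition of the reference type used by the example (assumed, not shown in the original snippet).
class Student
{
    public string StudentName { get; set; }
}

class Program
{
    // std2 refers to the same Student object as the caller's variable.
    static void ChangeReferenceType(Student std2)
    {
        std2.StudentName = "Steve";
    }

    static void Main(string[] args)
    {
        Student std1 = new Student();
        std1.StudentName = "Bill";

        ChangeReferenceType(std1);

        Console.WriteLine(std1.StudentName);
    }
}

// OUTPUT:
// Steve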
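
The same difference also shows up with plain assignment, without any method call. The following is a small sketch, assuming two hypothetical types that differ only in being a struct (value type) or a class (reference type); the names PointValue, PointReference and CopyDemo are invented for this example.

```csharp
using System;

// Hypothetical types with the same single field.
struct PointValue { public int X; }      // value type
class PointReference { public int X; }   // reference type

public static class CopyDemo
{
    // Assigning a struct copies its data: changing b leaves a untouched.
    public static int CopyStruct()
    {
        PointValue a = new PointValue { X = 1 };
        PointValue b = a;
        b.X = 2;
        return a.X;
    }

    // Assigning a class instance copies only the reference: a and b are the same object.
    public static int CopyClass()
    {
        PointReference a = new PointReference { X = 1 };
        PointReference b = a;
        b.X = 2;
        return a.X;
    }

    public static void Main()
    {
        Console.WriteLine(CopyStruct());
        Console.WriteLine(CopyClass());
    }
}

// OUTPUT:
// 1
// 2
```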

3_A) Search on the web how to drag and drop the name of any file into a richtextbox on your startup form and try to implement this feature in your first program.

(The cursor isn't shown because Windows is running in a Virtual Machine)


Researches about applications (RA)

1_RA) Observe carefully the different ways C# and VB.NET deal with events and the different ways to define the event handlers. Discuss in your blog what differences you can spot. Which way do you find easier or more comfortable and why?

First of all, let's clarify what an event is. In programming, an event is a notification that something has happened at a certain moment (an input, the mouse moving over an object, a click, a button press, etc.). I can handle these events and trigger actions when they happen. We can do this in both C# and VB.NET.
In C#, when I want to catch an event and fire an action, I subscribe a handler to the event; the event keeps a list of handlers that are called one after another when it is raised. In my first C# program I created a Button, for example:

this.button1 = new System.Windows.Forms.Button();
and then I added an Event Handler:
this.button1.Click += new System.EventHandler(this.button1_Click);
Let's analyze this. The += operator adds a handler to the list of actions to perform when the button is clicked (in this case the list contains only one action). When the click happens, the button1_Click function is called [7].
With VB.NET, instead, I did the same thing in a different way. I created a button object:
Me.Button1 = New System.Windows.Forms.Button()
but I don't have to add the handler to a list explicitly; instead, I declare which event the function handles directly in the function definition, as in this case:
Private Sub Button1_Click(sender As Object, e As EventArgs) Handles Button1.Click
    RichTextBox1.Text = "Hello, World!"
End Sub
In this way I'm saying the same thing as before: when the button is clicked (Handles Button1.Click), execute this function [8].
I don't see many differences, but a few small details shaped my opinion and led me to prefer C# overall. For event handling in particular, I find the VB.NET way easier and more readable, because the function definition and the handled event are on the same line. However, VB.NET descends from Visual Basic, while C# takes inspiration from C-like languages and, notably, from Java. Since I learned Java in my Bachelor's degree course, I feel more comfortable coding in C#.
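
To make the "list of actions" idea concrete without a form, here is a small console sketch. It assumes a hypothetical FakeButton class that stands in for a real System.Windows.Forms.Button; only the event mechanics (+=, -= and the invocation order) are meant to carry over.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical stand-in for a button: it exposes a Click event and a way to raise it.
public class FakeButton
{
    public event EventHandler Click;

    public void PerformClick()
    {
        EventHandler handler = Click;   // null when nobody has subscribed
        if (handler != null) handler(this, EventArgs.Empty);
    }
}

public static class EventDemo
{
    // Subscribes two handlers, raises the event, removes one handler, raises it again.
    public static List<string> Run()
    {
        List<string> log = new List<string>();
        FakeButton button = new FakeButton();

        EventHandler first = (sender, e) => log.Add("first");
        EventHandler second = (sender, e) => log.Add("second");

        button.Click += first;    // the list now holds one handler
        button.Click += second;   // ...and now two
        button.PerformClick();    // handlers run in subscription order

        button.Click -= first;    // unsubscribe the first handler
        button.PerformClick();    // only the second handler runs

        return log;
    }

    public static void Main()
    {
        Console.WriteLine(string.Join(", ", Run()));
    }
}

// OUTPUT:
// first, second, second
```

With the VB.NET Handles clause the subscription is written once in the method declaration, while in C# it is an ordinary statement that can be executed, or undone with -=, at any point in the program.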

2_RA) Note that any C# project will have a Program.cs file in its solution folder while a VB.NET project does not. On the other hand, VB.NET has the file Application.Designer.vb within the project folder. Try to research what these (automatically created) files are doing in your application and try to discover / reverse engineer the differences in how a C# and a VB.NET program are started.

Program.cs is an autogenerated file that contains the main entry point of the application, i.e. the Main function that is called when the program starts (as in other C-like languages). It's required to execute the program. In my first C# program the Main function just creates a new Form1 object and runs the application with it.
Application.Designer.vb, instead, is a file generated automatically by Visual Studio from the project's application settings. It contains settings such as the default startup form (in my case Form1), whether to save settings on exit, whether to enable visual styles, which type of application we're running, etc.
So we don't actually have an explicit Main in our code, but we can add one. In any case, this method in Application.Designer.vb defines the starting form of our application:

Protected Overrides Sub OnCreateMainForm()
    Me.MainForm = Global.WindowsApp1.Form1
End Sub 
                                


References

[1] Statistics.Laerd.com - Descriptive and inferential statistics
[2] NET-Informations - Difference between a Value Type and a Reference Type
[3] Microsoft Docs - Value types (C# reference)
[4] Tutorial Teacher - Value Type and Reference Type
[5] StackOverflow - Drag & drop and get file path in VB Net
[6] Microsoft Support - How to provide file drag and drop functionality in a Visual C application
[7] StackOverflow - Understanding events and event handlers in C#
[8] Microsoft Docs - Events (Visual Basic)