FutureLearn Week 2: Post 2 of 4

With big data becoming much more important in recent years, the need for data scientists has also increased. Data scientists are trained to discover, collect, process, analyse and present information drawn from these large volumes of data, and they often compare and combine it with data from other sources.

The skills a data scientist needs include data visualisation, statistics, data processing and systems engineering. They also need a working knowledge of programming languages, along with mathematical and analytical skills.

Popular posts from this blog

FutureLearn Week 3: Post 1 of 3

Citizen science can be applied to any kind of scientific observation that requires monitoring data. Examples include weather phenomena, the impact of climate change on the environment, local air quality, and the effects of climate change on different animal species. Citizen science for climate simulation and ecology can be incredibly helpful in monitoring many things. One example is monitoring microplastics in water, so that people become more aware of which products contribute to them and can lead efforts to reduce them, or to change how manufacturers make these products, so that microplastics are eliminated from water sources.

FutureLearn Week 2: Post 3 of 4

Two of the biggest challenges of big data are analysing and visualising the data. Firstly, with analysing the data, big data files can be substantial in size, so several things must be considered before downloading: the file size, how long the file will take to download, whether all of it is needed or part of the file will suffice, and whether the system has enough storage space. Visualisation is a way to represent the data in a form that is easier to understand, such as a word cloud. This helps users see the prominent and key terms from the analysis of the data sets. The first step after downloading the data would be to quality check it, to ensure that each field holds the appropriate data type and that the user understands the meaning of each field. Keeping a copy of the original data would be essential, as would documenting each version change at each stage of visualisation....
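
As a rough illustration of the quality check described above, here is a minimal Python sketch that tests whether each field in a small data set holds the expected data type. The function name check_field_types, the field names and the sample rows are all made up for this example, and a real check would depend on the actual data set.

```python
import csv
import io


def check_field_types(rows, expected_types):
    """Return (row_number, field, value) for every value that cannot be
    converted to the type expected for its field."""
    problems = []
    for row_number, row in enumerate(rows, start=1):
        for field, convert in expected_types.items():
            value = row.get(field)
            try:
                convert(value)
            except (TypeError, ValueError):
                # TypeError covers a missing field (None), ValueError a bad value.
                problems.append((row_number, field, value))
    return problems


# Hypothetical sample standing in for a downloaded data file.
sample = io.StringIO("station,temperature\nA1,12.5\nB2,n/a\n")
rows = list(csv.DictReader(sample))

# Assumed field meanings: station is text, temperature is a number.
problems = check_field_types(rows, {"station": str, "temperature": float})
print(problems)
```

Here the second row would be flagged because "n/a" is not a number. The check only reads the rows and never changes them, which fits with keeping the original data untouched.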