
FutureLearn Week 1: Post 1 of 3

My day-to-day use of the internet contributes to big data: tracking cookies, social media posts, and searches on Google or any other search engine for topics that interest me all generate data.


Popular posts from this blog

FutureLearn Week 2: Post 1 of 4

Open data has been increasing for some time now, with data sets being published openly on various sites around the world. Open data has many advantages, including the ability to share public data sets so that they can be compared. Open data sources can also be used to address environmental or health issues. One disadvantage is that the site providing the data can be biased, with the data shaped by the opinions of its creator.

Post #1: Definition of Big Data

Big Data is a term used to describe a massive volume of both structured and unstructured data that is so large it is difficult to process using traditional database and software techniques. In most enterprise scenarios the volume of data is too big, it moves too fast, or it exceeds current processing capacity. Big Data comes from text, audio, video, and images. Organisations and businesses analyse Big Data to discover patterns and trends in human behaviour and our interaction with technology, which can then be used to make decisions that affect how we live, work, and play. Big Data can also be analysed for insights that lead to better decisions and strategic business moves.

FutureLearn Week 2: Post 3 of 4

Two of the biggest challenges of big data are analysing and visualising the data. First, with analysis, big data files can be very large, so several things must be considered before downloading the data: the file size, how long the download will take, whether the whole file is needed or part of it will suffice, and whether the system has enough storage space. Visualisation is a way to represent the data in a form that is easier to understand, such as a word cloud. This helps users see the prominent and key terms from the analysis of the data sets. The first step after downloading the data would be to quality check it, ensuring that each field holds the appropriate data type and that the user understands the meaning of each field. Keeping a copy of the original data would be essential, as would documenting each version change at each stage of visualisation....
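The quality check and the "keep a copy of the original" step described here can be sketched in a few lines of Python. This is only a minimal illustration using the standard library: the function names `backup_original` and `check_field_types`, the field names, and the sample rows are all made up for the example and do not come from any real data set.

```python
import csv
import io
import shutil
from pathlib import Path


def backup_original(path: Path) -> Path:
    """Copy the downloaded file once so the original is never modified."""
    backup = path.with_name(path.name + ".orig")
    if not backup.exists():
        shutil.copy2(path, backup)
    return backup


def check_field_types(rows, expected):
    """Return (row_number, field, value) for each value that fails its expected type."""
    problems = []
    for row_number, row in enumerate(rows, start=1):
        for field, caster in expected.items():
            value = row.get(field)
            try:
                caster(value)  # e.g. int("abc") raises ValueError
            except (TypeError, ValueError):
                problems.append((row_number, field, value))
    return problems


# Hypothetical sample standing in for a downloaded CSV file.
SAMPLE = "town,population,area_km2\nTown A,1200,15.5\nTown B,unknown,27.0\n"

# Assumed meaning of each field: the type every value should convert to.
EXPECTED = {"town": str, "population": int, "area_km2": float}

rows = list(csv.DictReader(io.StringIO(SAMPLE)))
problems = check_field_types(rows, EXPECTED)
print(problems)  # [(2, 'population', 'unknown')]
```

In practice `backup_original` would be called on the downloaded file before any cleaning, and every row reported by `check_field_types` would be reviewed or corrected before moving on to visualisation.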