Unit 4 Outcome 1 | softwaredevelopment

Unit 4 Outcome ONE

U4O1 Key knowledge

Data and information

ways in which file size, storage medium and organisation of files affect access of data
uses of data structures to organise and manipulate data, including associative arrays (or dictionaries or hash tables)

Digital systems

procedures and techniques for handling and managing files, including security, archiving, backing up and disposing of files

Approaches to problem solving

processing features of a programming language, including instructions, procedures, methods, functions and control structures
algorithms for sorting, including selection sort and quick sort and their suitability for a given purpose, measured in terms of algorithm complexity and sort time
characteristics of efficient and effective solutions
techniques for checking that coded solutions meet design specifications, including construction of test data
validation techniques, including existence checking, range checking and type checking
techniques for testing the useability of solutions and forms of documenting test results
techniques for recording the progress of projects, including annotations, adjustments to tasks and timeframes, and logs
factors that influence the effectiveness of project plans
strategies for evaluating the efficiency and effectiveness of solutions and project plans.

U4O1 Key skills

organise and manage data and files
code solutions and write internal documentation
select and apply testing techniques to confirm that solutions operate as intended, and make necessary modifications
prepare and conduct useability tests using appropriate techniques, capture results, and make any necessary modifications to solutions
monitor and adjust project plans, where appropriate, and assess their usefulness in managing projects
evaluate the efficiency and effectiveness of solutions based on the criteria stated in the design.

Jansen Text Book

Chapter 3
Chapter 7
Chapter 8

Ways in which file size, storage medium and organisation of files affect access of data

When developing a solution, it is important to consider what data will be INPUT into the system. The system will only be effective if the data input is valid and correct.

When formatting and storing data it is important to consider the following issues:

How soon do I need the data back if lost?
How fast do I need to access the data?
How long do I need to retain data?
How secure does it need to be?
What regulatory requirements need to be adhered to?

Structure your data:

A text file of unorganised values is not going to be easy to access, sort or process so there are a number of ways data can be structured.

CSV stands for Comma-Separated Values. It is a delimited data format in which fields (columns) are separated by the comma character and records (rows) are terminated by new lines.

Example: Two records with five fields.

"Bill","Jones",1967,"Michellier St","04567823465"

"Fred","Smith",1981,"Terrious St","0475337467"
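
As a minimal sketch of how a program might read one CSV record, the VB code below splits a line into its fields. It assumes that no field contains a comma of its own; the module and function names (CsvExample, ParseRecord) are illustrative only.

```vbnet
Imports System

Module CsvExample
    ' Splits one CSV record into its fields.
    ' Assumption: no field contains an embedded comma.
    Function ParseRecord(ByVal line As String) As String()
        Dim fields() As String = line.Split(","c)
        For i As Integer = 0 To fields.Length - 1
            ' Remove surrounding spaces, then surrounding double quotes.
            fields(i) = fields(i).Trim().Trim(""""c)
        Next
        Return fields
    End Function

    Sub Main()
        Dim record As String = """Fred"",""Smith"",1981,""Terrious St"",""0475337467"""
        For Each item As String In ParseRecord(record)
            Console.WriteLine(item)
        Next
    End Sub
End Module
```

Each record comes back as an array, so the surname is fields(1) and the year is fields(2). Real CSV data with commas inside quoted fields would need a fuller parser.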

Extensible Markup Language (XML) is a markup language that defines a set of rules for encoding documents in a format that is both human-readable and machine-readable. It uses tags much as HTML does, but where HTML tags format a web page, XML tags mark out records and fields.

Example: Two records with two fields.

<breakfast_menu>

<food>

<name>Belgian Waffles</name>

<description>Two of our famous Belgian Waffles with plenty of real maple syrup</description>

</food>

<food>

<name>Strawberry Belgian Waffles</name>

<description>Light Belgian waffles covered with strawberries and whipped cream</description>

</food>

</breakfast_menu>
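
A program can read records and fields out of XML like this using the LINQ to XML classes in .NET. The sketch below is one possible approach, not the only one; the function name FoodNames is hypothetical and the description fields are left out of the sample text for brevity.

```vbnet
Imports System
Imports System.Collections.Generic
Imports System.Xml.Linq

Module XmlExample
    ' Returns the <name> field of every <food> record in the document.
    Function FoodNames(ByVal xmlText As String) As List(Of String)
        Dim root As XElement = XElement.Parse(xmlText)
        Dim names As New List(Of String)
        For Each food As XElement In root.Elements("food")
            names.Add(food.Element("name").Value)
        Next
        Return names
    End Function

    Sub Main()
        ' Shortened copy of the menu: descriptions omitted for brevity.
        Dim xmlText As String = _
            "<breakfast_menu>" & _
            "<food><name>Belgian Waffles</name></food>" & _
            "<food><name>Strawberry Belgian Waffles</name></food>" & _
            "</breakfast_menu>"
        For Each foodName As String In FoodNames(xmlText)
            Console.WriteLine(foodName)
        Next
    End Sub
End Module
```

XElement.Parse only accepts well-formed XML, so every opening tag must have a closing tag with exactly the same name.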

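Both formats store record-structured data, such as a database table, as plain text that many programs can read. CSV adds very little to the amount of storage required, while XML's tags take more space but describe each field.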

Storage Media

Data storage is the recording and storing of information (data) in a storage medium. Recording is accomplished by virtually any form of energy. Electronic data storage requires electrical power to store and retrieve data. Data storage in a digital, machine-readable medium is sometimes called digital data.

Barcodes and magnetic ink character recognition (MICR) are two ways of recording machine-readable data on paper.

Electronic storage of data can be grouped into three levels: Primary, Secondary and Tertiary.

Primary Storage includes the RAM and ROM that directly support the CPU. RAM is volatile, which means all data in it is lost when the device is powered down; ROM keeps its contents without power.

Secondary Storage differs from primary storage in that it is not directly accessible by the CPU. The computer usually uses its input/output channels to access secondary storage and transfers the desired data using an intermediate area in primary storage. Secondary storage does not lose the data when the device is powered down: it is non-volatile.

Examples of Secondary Devices include:

Hard drives (these can be in-built to a computer system or be stand-alone external drives)
CD-ROM
DVD
flash memory (e.g. USB flash drives or keys)
magnetic tape
standalone RAM disks
Zip drives.

Tertiary Storage typically involves an automatic mechanism that mounts removable mass storage media into a storage device when required. Data are often copied to secondary storage before use. It is primarily used for archiving rarely accessed information, since it is much slower than secondary storage, and is most useful for extraordinarily large data stores accessed without human operators. Typical examples include tape libraries and full back-ups.

Uses of data structures to organise and manipulate data, including associative arrays (or dictionaries or hash tables)

An Array is a data structure which holds a fixed number of values of a single type. Its size is set when the array is declared.

A One-Dimensional Array, illustrated below, is a list of product data with associated indexes. Dim Product(4) As String declares an array with 5 data elements, indexed 0 to 4. Each element has an index number associated with it. The index allows the array to be easily searched and manipulated.

Product(0) = "eggs"

Product(1) = "milk"

Product(2) = "bread"

Product(3) = "cheese"

Product(4) = "tomatoes"

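A Multi-Dimensional Array, as illustrated below, has more than one dimension, like a table with rows and columns. Here a two-column array is used to associate each shirt number with a soccer team position, which imitates what an Associative Array does. A true associative array (a dictionary or hash table, described below) finds a value directly from its key rather than from a numeric index. Dim Team(10, 1) As String holds a table of two columns (0 and 1) and eleven rows (0 - 10).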

Team(0, 0) = "1"

Team(0, 1) = "Goal Keeper"

Team(1, 0) = "2"

Team(1, 1) = "Right Full Back"

Team(2, 0) = "3"

Team(2, 1) = "Left Full Back"

Team(3, 0) = "4"

Team(3, 1) = "Centre Half Back"

Team(4, 0) = "5"

Team(4, 1) = "Centre Half Back"

Team(5, 0) = "6"

Team(5, 1) = "Defensive Midfielder"

Team(6, 0) = "7"

Team(6, 1) = "Right Winger"

Team(7, 0) = "8"

Team(7, 1) = "Central Midfielder"

Team(8, 0) = "9"

Team(8, 1) = "Striker"

Team(9, 0) = "10"

Team(9, 1) = "Attacking Midfielder"

Team(10, 0) = "11"

Team(10, 1) = "Left Winger"
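
To find a position from a shirt number in a two-dimensional array like this one, the program has to search the first column row by row. The sketch below shows the idea; the function name FindPosition is hypothetical and the table is shortened to three rows with no padding spaces inside the strings.

```vbnet
Imports System

Module TeamLookup
    ' Linear search of column 0 for a shirt number;
    ' returns column 1 of the matching row, or "" if not found.
    Function FindPosition(ByVal table(,) As String, ByVal shirtNumber As String) As String
        For row As Integer = 0 To table.GetUpperBound(0)
            If table(row, 0) = shirtNumber Then
                Return table(row, 1)
            End If
        Next
        Return ""
    End Function

    Sub Main()
        Dim Team(2, 1) As String   ' three rows only, for brevity
        Team(0, 0) = "1" : Team(0, 1) = "Goal Keeper"
        Team(1, 0) = "2" : Team(1, 1) = "Right Full Back"
        Team(2, 0) = "3" : Team(2, 1) = "Left Full Back"
        Console.WriteLine(FindPosition(Team, "2"))   ' Right Full Back
    End Sub
End Module
```

The comparison is exact, so any stray spaces stored inside the strings would stop a match from being found.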

A Dictionary is a data structure with many built-in functions that add, remove and access elements using a unique key. Compared to alternatives, a Dictionary is easy to use and effective. It has functions (like ContainsKey and TryGetValue) that do lookups. The key and the value can be of different data types (below, String keys and Integer values), whereas an array holds values of one type that are reached by integer index.

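Imports System
Imports System.Collections.Generic

Module Module1
    Sub Main()
        ' Create a Dictionary with String keys and Integer values.
        Dim dictionary As New Dictionary(Of String, Integer)
        ' Add four entries.
        dictionary.Add("Dot", 20)
        dictionary.Add("Net", 1)
        dictionary.Add("Perls", 10)
        dictionary.Add("Visual", -1)
        ' Look up a value by its key; TryGetValue returns False if the key is missing.
        Dim value As Integer
        If dictionary.TryGetValue("Perls", value) Then
            Console.WriteLine("Perls = " & value)
        End If
        ' ContainsKey tests whether a key exists before it is used.
        Console.WriteLine(dictionary.ContainsKey("Basic"))
    End Sub
End Module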

A Hash Table is a data structure that implements the dictionary operations: elements are inserted, searched for and deleted using the key associated with each element. A hash table runs the key through a calculation (a hash function, often involving a prime number) to work out a location in an array, which makes an element quick to find in a large set of unsorted data.

In a basic address book, you might have:

Bill

Surpreet

Jane

Nqube

Quentin

The problem with locating these names in an address book alphabetically is that it leaves the spaces in the table between Bill and Jane empty, wasting space. Also, a linear search would still be required under each alphabetical section.

Hash tables provide a numeric identifier based on the content of the data. If we use a basic conversion of a=1, b=2, c=3 etc. we could convert "Jane" to:

(J)10 + (a)1 + (n)14 + (e)5 = 30

Unfortunately if "Neaj" is added, his converted number would also be 30. So we use a Hash Function that allocates values depending on the location of each character in the string. We use a prime number, which helps us create values that are less likely to clash (in this example we will use 7).

Hash(Current Letter) = (Prime Number(7) * Hash from previous letter) + value of current letter.

Example:

JANE

Hash = 0

Hash (J) = (7 * 0) + 10 = 10

Hash (A) = (7 * 10) + 1 = 71

Hash (N) = (7 * 71) + 14 = 511

Hash (E) = (7 * 511) + 5 = 3,582

Hash (JANE) = 3,582
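
The same calculation can be sketched as a VB function. This is an illustration only: the name SimpleHash is hypothetical, and it assumes short keys made of letters only. Counting a = 1 makes J = 10 and N = 14, so the function gives 3582 for "Jane".

```vbnet
Imports System

Module HashExample
    ' Positional hash: hash = (7 * previous hash) + letter value,
    ' where a = 1, b = 2, ... z = 26.
    ' Assumption: the key is short and contains letters only.
    Function SimpleHash(ByVal word As String) As Integer
        Dim hash As Integer = 0
        For Each ch As Char In word.ToUpper()
            Dim letterValue As Integer = Convert.ToInt32(ch) - Convert.ToInt32("A"c) + 1
            hash = (7 * hash) + letterValue
        Next
        Return hash
    End Function

    Sub Main()
        Console.WriteLine(SimpleHash("Jane"))   ' 3582
        Console.WriteLine(SimpleHash("Neaj"))   ' 5064
    End Sub
End Module
```

Because each letter's contribution depends on its position, "Jane" and "Neaj" no longer share a value. Different keys can still occasionally produce the same number, so a full hash table also needs a way of handling such clashes.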

For more detailed VB examples, see the searching techniques below. For a VB tutorial on setting up a Hash Table, try this site: https://www.dotnetperls.com/hashtable-vbnet

Procedures and techniques for handling and managing files, including security, archiving, backing up and disposing of files

Security

Keeping data safe from deliberate and accidental threats is essential. Accidental threats include: users not saving, accidentally deleting files, losing or destroying a USB memory stick, and sending data to the wrong destination in an email. Deliberate threats include: malware, phishing and hacking.

When transferring data across the internet or a network, data can be protected with encryption. If data is stored on a server or a device it can be kept safe with the following measures:
password protection
firewall
anti-virus software
physical control of access to devices through locked rooms.

Event-based threats also need to be considered, such as power surges or failures and natural disasters. The best way to protect against power-related threats is to use an Uninterruptible Power Supply (UPS). To guard against other threats that could delete or destroy data and the hardware that contains it, we need to consider storing a copy elsewhere.

Archiving is the process of removing data that is no longer used from the current information system. It is then stored elsewhere, on another server or perhaps on magnetic tape. This frees up storage space for active data in the system, but still allows access to the archived data if required.

Back Up - Differential & Incremental

A Back Up is a copy of data that is stored somewhere other than with the active data in use in the system. A Full Back Up makes a copy of ALL files on the system. This takes a long time and is often done overnight or on weekends onto magnetic tape. It is not done very often, in some instances only once per month. Often off-site servers are used to control and store back up data.

A Differential Back Up backs up only the files that have changed since the last Full Back Up. If the last Full Back Up was done on Sunday 1st June and the only files edited this week were a Word document called "Proposal.doc" on Monday and an Excel spreadsheet called "Budget.xls" on Wednesday, the Differential Back Up on Wednesday will copy both Proposal.doc and Budget.xls.

Incremental Back Ups also back up only changed data, but only the data that has changed since the last backup of any kind, whether full or incremental.

To continue our example above: our Full Back Up was on Sunday 1st June and an Incremental Back Up is done every night of the week. So on Monday night the Proposal.doc file will be backed up, then on Wednesday night only Budget.xls will be backed up.
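
The difference between the two comes down to which date the file's "last modified" time is compared against. The sketch below models the example week; the function name ChangedSince, the times of day and the year 2025 (chosen because 1 June falls on a Sunday) are all assumptions made for illustration.

```vbnet
Imports System
Imports System.Collections.Generic

Module BackupExample
    ' Returns the names of the files modified after the given date and time.
    Function ChangedSince(ByVal names() As String, ByVal modified() As Date, _
                          ByVal since As Date) As List(Of String)
        Dim result As New List(Of String)
        For i As Integer = 0 To names.Length - 1
            If modified(i) > since Then
                result.Add(names(i))
            End If
        Next
        Return result
    End Function

    Sub Main()
        Dim names() As String = {"Proposal.doc", "Budget.xls"}
        Dim modified() As Date = {New Date(2025, 6, 2, 10, 0, 0), _
                                  New Date(2025, 6, 4, 10, 0, 0)}   ' Monday, Wednesday

        Dim lastFull As Date = New Date(2025, 6, 1, 23, 0, 0)          ' Sunday night
        Dim lastIncremental As Date = New Date(2025, 6, 3, 23, 0, 0)   ' Tuesday night

        ' Differential on Wednesday night: compare with the last FULL backup.
        Console.WriteLine(String.Join(", ", ChangedSince(names, modified, lastFull).ToArray()))
        ' Incremental on Wednesday night: compare with the PREVIOUS backup.
        Console.WriteLine(String.Join(", ", ChangedSince(names, modified, lastIncremental).ToArray()))
    End Sub
End Module
```

The first line lists both files and the second lists only Budget.xls. Differential backups grow through the week but need only the full backup plus the latest differential to restore; incremental backups stay small but every one since the full backup is needed.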

Disposal is required for data that is no longer needed and may contain sensitive information. Eventually all storage devices become filled to capacity, and if data is not required to be archived it can be removed entirely. In the case of unwanted government computers, before disposal they must undergo special erasure processes to ensure that no one can access any remnants of data left behind.

Processing features of a programming language, including instructions, procedures, methods, functions and control structures

An instruction is an action that a program should carry out. Here are some examples:

strName = txtName.text

DblTotalPrice = DblSalesPrice * IntQuantity

A procedure is a self-contained group of instructions that carries out a specific task; it is also called a routine, subroutine or module. In VB a procedure is a block of Visual Basic statements enclosed by a declaration statement (Function, Sub, Operator, Get, Set) and a matching End declaration. All executable statements in Visual Basic must be within some procedure. Here is an example:

Public Sub Main()
    Dim strName As String
    strName = InputBox("Enter your name", "Name")
    frmStart.Caption = "Hello " & strName
    frmStart.Show
End Sub

A Method is an action that can be carried out on an object of a given class, and it can change some of the properties of that object, e.g. Button.Show makes a button visible. Displaying text in an object, by contrast, is usually done by assigning a value to one of its properties rather than by calling a method. For example:

txtName.text = strName

A Function is a special type of procedure that calculates and returns a value. The example below calculates the hypotenuse of a right-angled triangle:

Function hypotenuse(ByVal side1 As Single, ByVal side2 As Single) As Single
    ' Math.Sqrt returns a Double, so convert the result back to Single.
    Return CSng(Math.Sqrt((side1 ^ 2) + (side2 ^ 2)))
End Function

Control Structures are blocks of code that control which lines of code will be executed and in what order. There are three types of control structures:

Selection structure, e.g. IF/THEN/END IF
Iteration structure – any looping structure such as FOR/NEXT or DO WHILE/LOOP
Sequence structure – lines are executed in the order they appear. Statements such as GOTO and EXIT break this order by jumping to another line.
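
The three structures usually appear together. As a small illustration (the function SumOfEvens is a made-up example, not part of any library), the code below uses all three to add up the even numbers from 1 to a limit:

```vbnet
Imports System

Module ControlStructures
    ' Adds up the even numbers from 1 to limit.
    Function SumOfEvens(ByVal limit As Integer) As Integer
        Dim total As Integer = 0                ' sequence: runs first
        For number As Integer = 1 To limit      ' iteration: FOR/NEXT loop
            If number Mod 2 = 0 Then            ' selection: IF/THEN/END IF
                total = total + number
            End If
        Next
        Return total                            ' sequence: runs last
    End Function

    Sub Main()
        Console.WriteLine(SumOfEvens(10))   ' 2 + 4 + 6 + 8 + 10 = 30
    End Sub
End Module
```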

Algorithms for sorting, including selection sort and quick sort and their suitability for a given purpose, measured in terms of algorithm complexity and sort time

Selection Sort

The simplest sort is the Selection Sort, which functions the way you would naturally sort items. It looks through the whole list for the smallest item; each trip through the list is called a PASS. During a pass it COMPARES each item with the smallest found so far, and at the end of the pass it SWAPS the smallest item into the first unsorted position. The sort then passes through the rest of the unsorted list for the next smallest item and swaps it into the next position. This process repeats until the list is completely sorted.

Below is the VB Code for a Selection Sort.
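
The following is a minimal sketch of a Selection Sort on an array of integers, sorting in place with one swap per pass. The array contents and the names used are illustrative assumptions.

```vbnet
Imports System

Module SelectionSortExample
    ' Sorts the array in place into ascending order.
    Sub SelectionSort(ByVal items() As Integer)
        For pass As Integer = 0 To items.Length - 2
            ' Find the smallest item in the unsorted part of the list.
            Dim smallest As Integer = pass
            For i As Integer = pass + 1 To items.Length - 1
                If items(i) < items(smallest) Then
                    smallest = i
                End If
            Next
            ' Swap it into the first unsorted position.
            Dim temp As Integer = items(pass)
            items(pass) = items(smallest)
            items(smallest) = temp
        Next
    End Sub

    Sub Main()
        Dim numbers() As Integer = {29, 10, 14, 37, 13}
        SelectionSort(numbers)
        For Each n As Integer In numbers
            Console.Write(n & " ")   ' 10 13 14 29 37
        Next
    End Sub
End Module
```

Every pass scans the whole unsorted part, so a list of n items always needs roughly n x n / 2 comparisons, whatever order it starts in. That makes the sort time predictable and acceptable for short lists, but slow for large ones.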

For more information about Selection Sort, the Wikipedia page has a very good animation and description, and this YouTube video demonstrates it very well.

Quick Sort

Quick Sort is a more sophisticated sorting technique that uses Divide and Conquer around a Pivot. There are plenty of examples online that can explain the Quick Sort technique better than I can. The website Understanding QuickSort is an interactive demonstration of how a pivot is selected and then compared against each item in the array, sorting the items to either side of the pivot.

This short YouTube video may assist you in understanding the process. This website has some excellent notes on Quick Sort.
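
As a rough sketch of the idea in VB, the code below takes the last item of each section as the pivot. That is only one of several common ways of choosing a pivot, and the names used are illustrative.

```vbnet
Imports System

Module QuickSortExample
    ' Sorts items(low) to items(high) in place.
    ' Assumption: the last item of the section is used as the pivot.
    Sub QuickSort(ByVal items() As Integer, ByVal low As Integer, ByVal high As Integer)
        If low >= high Then Return          ' zero or one item: already sorted
        Dim pivot As Integer = items(high)
        Dim boundary As Integer = low       ' items left of boundary are smaller than the pivot
        For i As Integer = low To high - 1
            If items(i) < pivot Then
                Swap(items, i, boundary)
                boundary += 1
            End If
        Next
        Swap(items, boundary, high)          ' place the pivot between the two groups
        QuickSort(items, low, boundary - 1)  ' conquer the smaller items
        QuickSort(items, boundary + 1, high) ' conquer the larger items
    End Sub

    Sub Swap(ByVal items() As Integer, ByVal a As Integer, ByVal b As Integer)
        Dim temp As Integer = items(a)
        items(a) = items(b)
        items(b) = temp
    End Sub

    Sub Main()
        Dim numbers() As Integer = {29, 10, 14, 37, 13}
        QuickSort(numbers, 0, numbers.Length - 1)
        For Each n As Integer In numbers
            Console.Write(n & " ")   ' 10 13 14 29 37
        Next
    End Sub
End Module
```

On average Quick Sort needs in the order of n x log n comparisons, so it is usually much faster than Selection Sort on large lists. Its worst case is in the order of n x n, which this pivot choice hits when the list is already sorted.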

Characteristics of efficient and effective solutions

Efficiency is the avoidance of wasting time, money and effort, while effectiveness relates to how well the solution meets its requirements.

An efficient solution does its job quickly, doesn't slow down the user's workflow by being confusing, is easy to maintain, easy to use and easy to learn, and does a lot of work in a short amount of time.

Efficiency:

speed of processing
ease of use
easy to learn to use
cost of data and file manipulation in time, money and effort
productivity
operational costs in time, money and effort
level of automation implicit in the system.

An effective solution is:

reliable
maintainable
complete
readable
attractive
clear
accurate
accessible
timely
relevant
useable.

This excellent slideshow by Mark Kelly outlines key exam questions that test whether you know the difference between efficiency and effectiveness.

Techniques for checking that coded solutions meet design specifications, including construction of test data

Checking coded solutions requires a testing table of data. Test data should test the boundaries of all conditions in the solution.

For example, the algorithm below reads the names of eleven Soccer Players into an array called SoccerTeam. It also checks their age and identifies each player as a Senior or Junior.

ALGORITHM

counter ← 0

WHILE counter <= 10

    name ← Input ("Add the name of the soccer player")

    SoccerTeam(counter) ← name

    age ← Input ("Add the age of the player")

    IF age >= 18 THEN

        Msg (name, "is a senior player")

    ELSE

        Msg (name, "is a junior player")

    ENDIF

    counter ← counter + 1

ENDWHILE

In order to test the logic of the algorithm it would be best to desk check it with a set of test data that checks the boundaries used to control the loop and the if statement.

TEST DATA for "age" variables ← 17, 18, 19

There is no need to test any other values for this variable. 17 tests all values below 18, 19 tests all values above, and of course you need to test what happens when 18 itself is entered.

TEST DATA TABLE - lists the test data and the Expected output (what the program should output correctly) and the Actual output (what the program actually output during testing). Ideally Expected and Actual should be the same - which means your logic in your algorithm is correct. Likewise, running this test data through your newly programmed solution will check it for errors.

Variable (age)

INPUT    Expected output      Actual output

17       ...Junior player     ...Junior player

18       ...Senior player     ...Senior player

19       ...Senior player     ...Senior player

When there is a difference between Expected and Actual you have a logic error and you need to go back and re-visit your algorithm.
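
Once the solution is coded, the same test data table can be run automatically. The sketch below assumes that players aged 18 and over are seniors, as the expected outputs in the table indicate; the function name Classify is hypothetical.

```vbnet
Imports System

Module AgeTest
    ' The selection logic under test: 18 and over is a senior player.
    Function Classify(ByVal age As Integer) As String
        If age >= 18 Then
            Return "Senior player"
        Else
            Return "Junior player"
        End If
    End Function

    Sub Main()
        ' Boundary test data and the expected output for each value.
        Dim inputs() As Integer = {17, 18, 19}
        Dim expected() As String = {"Junior player", "Senior player", "Senior player"}
        For i As Integer = 0 To inputs.Length - 1
            Dim actual As String = Classify(inputs(i))
            Dim result As String = If(actual = expected(i), "PASS", "FAIL")
            Console.WriteLine(inputs(i) & "  " & expected(i) & "  " & actual & "  " & result)
        Next
    End Sub
End Module
```

If the condition were typed the wrong way round, for example age <= 18, the rows for 17 and 19 would print FAIL, which is exactly the kind of logic error boundary values are chosen to expose.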

The above algorithm does not contain any validation, so there is no Existence checking or Type checking to test. However, the boundary that controls the loop can also be checked. A testing table of names that goes beyond the 11 the array can hold would test what happens when the user attempts to enter a 12th soccer player.

Mark Kelly PowerPoint on Test Data

Validation techniques, including existence checking, range checking and type checking

The three main validation techniques test input from the user to ensure the data is able to be processed.

1. Existence Checking tests to see if the data exists. For example, if the user has not entered their Name into the form then the form cannot be processed. Validation would alert the user to enter the data that is missing.

2. Range Checking tests to see if the entered data is within a set range. For example, a post code in Australia consists of only 4 digits. Postcodes in NSW all start with "2". If the post code is restricted to NSW then a range check can test for input between 2000 and 2999. A common range check is applied to date of birth data. This type of range check can test whether the user's Date of Birth makes them over 18, or whether the date is plausible at all. Anyone entering a year of birth more than 100 years ago may be considered out of range.

3. Type Checking tests to see if the data type is appropriate for the processing. Setting a Type Check on a postcode can further limit data entry to numeric values, returning a warning if letters or other symbols are entered.
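
The three checks are often applied one after another to the same input. The sketch below does this for the NSW postcode example; the function name ValidatePostcode and the wording of the messages are assumptions for illustration.

```vbnet
Imports System

Module PostcodeValidation
    ' Returns "" when the entry is valid, otherwise a message for the user.
    Function ValidatePostcode(ByVal entry As String) As String
        ' Existence check: something must have been entered.
        If String.IsNullOrWhiteSpace(entry) Then
            Return "Please enter a postcode."
        End If
        ' Type check: the entry must be a whole number.
        Dim postcode As Integer
        If Not Integer.TryParse(entry.Trim(), postcode) Then
            Return "The postcode must be a whole number."
        End If
        ' Range check: NSW postcodes are taken to be 2000 to 2999.
        If postcode < 2000 OrElse postcode > 2999 Then
            Return "The postcode must be between 2000 and 2999."
        End If
        Return ""
    End Function

    Sub Main()
        Dim samples() As String = {"", "abc", "3000", "2150"}
        For Each sample As String In samples
            Console.WriteLine("[" & sample & "] " & ValidatePostcode(sample))
        Next
    End Sub
End Module
```

The order matters: the range check can only run once the type check has shown that the entry is a number, and neither makes sense if nothing was entered.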

Mark Kelly PowerPoint on Validation

Techniques for testing the useability of solutions and forms of documenting test results

When a user interacts with systems or products such as apps, websites, devices or software, the quality of their experience is referred to as usability. It is about the overall satisfaction, efficiency and effectiveness for the user. This is black box testing, where the tester does not know the code that makes up the solution.

Usability Testing tests the following features of the software.

How easy it is to use the software.
How easy it is to learn the software.
How convenient the software is for the end user.

To test the useability there is a range of different tests a developer can do:

1. Create a mockup interface for the solution, either on a screen or on paper, and use it with test subjects, asking them for their impressions of how it will be used.

2. Test a working prototype with potential end users and observe their interaction with the solution. Write down observations.

3. Provide online chat or forms to allow users to raise issues or grievances with the use of the solution.

4. Collect user feedback through surveys. Make comparisons between the old system and the new solution.

5. Create bespoke tests for the solution with a full set of test data that tests the limits of ranges and other limits.

Another Mark Kelly PowerPoint on Testing

Techniques for recording the progress of projects, including annotations, adjustments to tasks and timeframes, and logs

The role of the project plan is not to remain static, but to be adapted as times, resources and scheduling vary. It is important that the project plan is monitored throughout the life of the project. When changes are made to the nature of tasks, annotations need to be included so they can be tracked. Project Managers should keep a log of all actions and changes to the plan with dates and team member identification.

As part of the SD SAT it is advisable that the Gantt Chart is updated, and annotations are made through the life of the project so that all logs are available for submission at the end of the project.

Factors that influence the effectiveness of project plans

The order of the tasks is important. The critical path of key tasks needs to define the work flow from one milestone to the next when milestones are dependent on each other.

The length of time allocated for each task is going to affect how successful the project management is.

The allocation of appropriate human, hardware and software resources as well as data and information is crucial for the plan's success.

All team members should be using the project plan and adapting it as the project progresses.

A contingency plan should be incorporated in case there is an issue during part of the project's development.

Strategies for evaluating the efficiency and effectiveness of solutions and project plans.

Evaluating Efficiency

There are three ways to evaluate efficiency of a software solution:

Speed of Processing - does it complete the task without an unreasonable delay?

Functionality - how well does the solution do what it was designed to do?

Cost of File Manipulation - how many times are files manipulated? Minimum manipulation increases efficiency.

Evaluating Effectiveness

There are many ways to evaluate effectiveness of a software solution:

Completeness - Is all the output complete? Does the user need anything outside of the system to do the task?

Attractiveness - Have the design elements created a solution that is appealing and clearly related to the system's function?

Clarity/Readability - Is it clear enough for users to read to complete a task using the system?

Accuracy - Is the output from the system always correct?

Accessibility - Is the interface easy to use for all the users? The design of the interface needs to pay attention to the characteristics and limitations of end users.

Timeliness - Is the entry, processing of data and the production of output information completed in a time frame that allows practical use of the output?

Communication of Message -

Relevance - Is the data that has been included in the output relevant to the needs of the user?

Useability - Given the user characteristics, how easy is the solution to use?