Imagine... 
							Your results are amazing!
							but wrong
						
						
		
						
						Mike Konczal
					“all I can hope is that future historians note that one of the core empirical points providing the intellectual foundation for the global move to austerity in the early 2010s was based on someone accidentally not updating a row formula in Excel”
						
		
					
					What were the real errors?
					
						- They used Excel (subject to debate)
- They didn't share their code and data (Vital!)
							This 2003 trial, done in Kenya, found that deworming whole schools improved children’s health, school performance, and school attendance.
							In 2013, the data was reanalysed independently using new computer programs
							Many mistakes were found.
						
We have a problem!
							 
						
		
						
							Croucher's law
							I can be an idiot and WILL make mistakes.
							You are no different!
						
		
		
						
							Your Analysis?
							What you did
							Open package foo. Click, Click, drag, Click, Click, Click, Right-Click, Save, 'results.csv'.
							Load into Excel. Click, drag, generate graph, right click, save, 'pretty-graph.png'
						
						
							Your Analysis?
							What you said
							 I analysed my data in foo using the bar analysis. Here's a graph of the results.
						
						
							How reproducible is a mouse click?
						
					
						Automate
						aka 'learn to program'
					
					
						The Ideal
						Results = TheAnalysis(MyData)
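A minimal sketch of what "Results = TheAnalysis(MyData)" could look like as a script that replaces the clicks. Everything named here is an assumption made for illustration: the input is a hypothetical CSV file with a `value` column, and `the_analysis` is a stand-in that only computes summary statistics.

```python
import csv
import statistics
import sys


def load_data(path):
    """Read the hypothetical 'value' column from a CSV file as floats."""
    with open(path, newline="") as f:
        return [float(row["value"]) for row in csv.DictReader(f)]


def the_analysis(values):
    """Stand-in for the real analysis: summary statistics only."""
    return {
        "n": len(values),
        "mean": statistics.mean(values),
        "stdev": statistics.stdev(values),
    }


def main(in_path, out_path):
    """Results = TheAnalysis(MyData), with no mouse clicks involved."""
    results = the_analysis(load_data(in_path))
    with open(out_path, "w", newline="") as f:
        writer = csv.writer(f)
        writer.writerow(["statistic", "value"])
        for name, value in results.items():
            writer.writerow([name, value])
    return results


# Only runs when called as: python analysis.py INPUT.csv OUTPUT.csv
if __name__ == "__main__" and len(sys.argv) == 3:
    main(sys.argv[1], sys.argv[2])
```

Saved as, say, `analysis.py` (a hypothetical name), the whole pipeline becomes one command that anyone can rerun and read.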
						 
					
					
						Reality
						 
					
				
		
				
					Problem
						I am an idiot and will make mistakes
		
					(Partial) Solutions
						
							- Automate (aka learn to program)
						
Write code in a (very) high-level language
						 
					
		
					
					
						Why high-level languages?
						"Programmers write roughly the same number of lines of code per unit time regardless of the language they use" (Best Practices for Scientific Computing, PLOS Biology, Wilson et al.)
					
					
						What about speed?
						
							-  Computer time is cheap. Programmer time is expensive.
							
-  We all have supercomputers now!
							
-  Ensure it's correct, then worry about speed.
							
-  Call NAG to help with the slow bits
							
Problem
						I am an idiot and will make mistakes
		
					(Partial) Solutions
						
							- Automate (aka learn to program)
							
- Write code in a (very) high-level language
						
Two facts that, combined, worry me:
						- Scientists typically spend 30% or more of their time developing software
						- 90% or more of them are primarily self-taught
						Hannay JE, Langtangen HP, MacLeod C, Pfahl D, Singer J, et al. (2009) How do scientists develop and use scientific software? In: Proceedings Second International Workshop on Software Engineering for Computational Science and Engineering. pp. 1–8. doi:10.1109/SECSE.2009.5069155.
						
						Prabhu P, Jablin TB, Raman A, Zhang Y, Huang J, et al. (2011) A survey of the practice of computational science. In: Proceedings 24th ACM/IEEE Conference on High Performance Computing, Networking, Storage and Analysis. pp. 19:1–19:12. doi:10.1145/2063348.2063374.
					
					
					
		
				
					Problem
						I am an idiot and will make mistakes
		
					(Partial) Solutions
						
							- Automate (aka learn to program)
							
- Write code in a (very) high-level language
							
- Get some training
						
Is this familiar?
					
					-  code_ver1.m
					
-  code_ver1b_BROKEN.m
					
-  code_ver1b_BROKEN_Working_march20.m
					
-  code_ver1b_BROKEN_Working_march20_Bobs_mods_ForMike.m
					
Which version did the results come from?
				
				
				
				
					Taking you back to your happy place
				
			
				True Story 
				
					-  Me: Can I see the code please?
					
-  Them: I'll just get the changes from Bob folded in and email it
					
-  Me: Shouldn't we be using version control?
					
-  Them: No need - it's overkill. We don't have a VC problem.
					
-  Me: The code you sent me doesn't work
					
-  Them: Sorry. I sent the wrong version.
				
Which version control system should you use?
				I like and use 'git' but use whatever your colleagues are using.
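Once the code is in git, "which version did the results come from?" can be answered automatically. The sketch below assumes the analysis runs inside a git working copy and that results are held in a dictionary. The function names and the `code_version` key are hypothetical. The only git command used is `git rev-parse HEAD`.

```python
import subprocess


def current_commit(repo_dir="."):
    """Return the current git commit hash, or 'unknown' if unavailable."""
    try:
        out = subprocess.run(
            ["git", "rev-parse", "HEAD"],
            cwd=repo_dir,
            capture_output=True,
            text=True,
            check=True,
        )
    except (OSError, subprocess.CalledProcessError):
        # git is not installed, or repo_dir is not inside a repository
        return "unknown"
    return out.stdout.strip()


def stamp_results(results, repo_dir="."):
    """Attach the code version to a results dictionary (hypothetical layout)."""
    stamped = dict(results)
    stamped["code_version"] = current_commit(repo_dir)
    return stamped
```

Writing the stamped dictionary out next to the figures ties every result to one exact commit, which file names like `code_ver1b_BROKEN_Working_march20.m` cannot do.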
			
				
		
			
				Problem
					I am an idiot and will make mistakes
		
				(Partial) Solutions
					
						- Automate (aka learn to program)
						
- Write code in a (very) high-level language
						
- Get some training
						
- Use version control
					
Get a code buddy (Code Review Light)
				 Doesn't have to understand your research
				 Remit: tell me where I could do better
				 Problem 1: Get the code running on THEIR machine
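One small thing that may help with Problem 1 is sending your code buddy a record of the environment the code last ran in. This is a sketch using only the standard library. The default package names are placeholders, so list whatever your own code actually imports.

```python
import platform
import sys
from importlib import metadata


def environment_report(packages=("numpy", "scipy")):
    """Collect interpreter, OS and package versions for a code buddy."""
    report = {
        "python": sys.version.split()[0],
        "platform": platform.platform(),
    }
    for name in packages:
        try:
            report[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            report[name] = "not installed"
    return report


if __name__ == "__main__":
    for key, value in environment_report().items():
        print(f"{key}: {value}")
```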
			
			
				Get a code buddy
				 
			
		
		
			Problem
				I am an idiot and will make mistakes
		
			(Partial) Solutions
				
					- Automate (aka learn to program)
					
- Write code in a (very) high-level language
					
- Get some training
					
- Use version control
					
- Get a code buddy
				
Share your code and data openly
(As possible)
						
					
					
						You've come so far...
						
							-  You can get your results by entering one command
							
-  Your code buddy has seen your code -> Show it to the world
							
-  Your code is in git -> upload to public GitHub
						
Benefits
						
							 -  It's the right thing to do
							
-  Others will use, debug and enhance your work
							
-  Others will reproduce and cite your work
							
-  More opportunities to collaborate
						
Openly as possible?
							If you can't be fully open, be as open as possible within your organisation
						
		
				
					Literate computing
					Traditional reports are just advertisements
				
		
					
		
					
		
					
						A Literate computing document IS the research
					
		
					
						Literate computing technologies
						
					
		
				
					Problem
						I am an idiot and will make mistakes
		
					(Partial) Solutions
						
							- Automate (aka learn to program)
							
- Write code in a (very) high-level language
							
- Get some training
							
- Use version control
							
- Get a code buddy 
							
- Share your code and data openly
							
- Use literate computing technologies
Afraid to change your code?
						 
					
		
	
						
							Write tests
							
								- Every decent language has a testing framework
								
- Learn how to use it
								
- You write additional code that ensures your code gives the answers you expect
								
- Tests give you confidence to make changes
							
		$ nosetests ./unittests.py
		..............................
		----------------------------------------------------------------------
		Ran 30 tests in 0.152s
		
		OK
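A minimal sketch of what such tests can look like, using Python's built-in `unittest` framework rather than the nose runner shown above. The `normalise` function is a made-up example, not code from any real analysis.

```python
import unittest


def normalise(values):
    """Scale a list of non-negative numbers so that they sum to 1."""
    total = sum(values)
    if total == 0:
        raise ValueError("cannot normalise values that sum to zero")
    return [v / total for v in values]


class TestNormalise(unittest.TestCase):
    def test_sums_to_one(self):
        self.assertAlmostEqual(sum(normalise([1, 2, 3, 4])), 1.0)

    def test_preserves_ratios(self):
        a, b = normalise([1, 3])
        self.assertAlmostEqual(b / a, 3.0)

    def test_rejects_all_zero_input(self):
        with self.assertRaises(ValueError):
            normalise([0, 0])
```

Saved as, say, `test_normalise.py` (a hypothetical file name), the tests can be run with `python -m unittest test_normalise.py`. If a later change breaks `normalise`, a test fails instead of the error reaching your results.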
		
		
		
					
				
		
		
				Problem
					I am an idiot and will make mistakes
		
				(Partial) Solutions
					
						- Automate (aka learn to program)
						
- Write code in a (very) high-level language
						
- Get some training
						
- Use version control
						
- Get a code buddy 
						
- Share your code and data openly
						
- Use literate computing technologies
- Write tests
					
Numerical Computing is hard!
		 
		 
				Hypotenuse of a triangle
				  Easy!
					h = sqrt(x*x + y*y)
					So why is the hypot function in math.h, Python and MATLAB?
			
			
					A better hypot
					
						
						
max = maximum(|x|, |y|)
min = minimum(|x|, |y|)
if max == 0: return 0
r = min / max
return max*sqrt(1 + r*r)
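A short Python sketch of why the scaled form matters. The function names are illustrative, and `scaled_hypot` is only a teaching version: it guards against x == y == 0 but does not treat infinities or NaNs specially, so in real code prefer the library's own `math.hypot`.

```python
import math


def naive_hypot(x, y):
    """Textbook formula; the squares can overflow or underflow."""
    return math.sqrt(x * x + y * y)


def scaled_hypot(x, y):
    """Scaled formula, with a guard for x == y == 0."""
    big = max(abs(x), abs(y))
    small = min(abs(x), abs(y))
    if big == 0.0:
        return 0.0
    r = small / big  # 0 <= r <= 1, so r * r cannot overflow
    return big * math.sqrt(1.0 + r * r)


# Compare both versions against the library routine
for x, y in [(3.0, 4.0), (1e200, 1e200), (1e-200, 1e-200)]:
    print(x, y, naive_hypot(x, y), scaled_hypot(x, y), math.hypot(x, y))
```

For the two extreme inputs the naive version returns `inf` and `0.0` respectively, because `x*x` leaves the range of a double even though the true answer is representable. The scaled version agrees with `math.hypot`.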
				
					
					
			
    
		
		
		
				A story of sin(x)
				Who do the experts go to for help?
			
			
			
					Numerical Computing is hard!
					....and it's getting harder!
	
	
			Parallelisation and adding up
			ans = 0.1 + 0.2 + 0.3
			order matters!
	
				order matters
				x = (0.1+0.2) + 0.3 = 0.60000000000000009
        y = 0.1 + (0.2+0.3) = 0.59999999999999998
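The two sums above can be reproduced in Python, which uses the same double-precision arithmetic. `math.fsum` is shown as one possible mitigation, not as the approach any particular parallel library takes.

```python
import math

left = (0.1 + 0.2) + 0.3
right = 0.1 + (0.2 + 0.3)

print(format(left, ".17g"))   # 0.60000000000000009
print(format(right, ".17g"))  # 0.59999999999999998
print(left == right)          # False

# math.fsum tracks the rounding error, so its result does not depend on order
print(math.fsum([0.1, 0.2, 0.3]) == math.fsum([0.3, 0.2, 0.1]))  # True
```

A parallel reduction is free to group the terms either way, so the same program can give slightly different answers from run to run unless the summation order is fixed or compensated.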
	
		Technological drivers
		
		-  Many-core parallelism
-  Low precision arithmetic
-  Vectorisation
-  Exotic hardware
Use a quality numerical library
	
		The NAG Library
		1900+ routines
		Python, MATLAB, C, C++, Java, Fortran etc etc
		
						
		
				(Partial) Solutions
					
						- Automate (aka learn to program)
						
- Write code in a (very) high-level language
						
- Get some training
						
- Use version control
						
- Get a code buddy 
						
- Share your code and data openly
						
- Use literate computing technologies
- Write tests
						
- Use a quality numerical library
NAG
					
							- Consultancy (We help you do stuff)
							
- Products (We build stuff)
							
- Training (We teach you stuff)
							
- Research (We develop new stuff)
							
- Open source (We give some stuff away)
							
- Community
					
NAG - Some Academic Collaborations
					
						- Your University (Your research)	
						
- University of Oxford (Mathematical Optimisation)
							
- University of Manchester (Accelerated Linear Algebra)
						    
- University of Lancaster (wavelet models)			
							
- University of Sheffield (ML - Clustering)
							
- RWTH Aachen University (Algorithmic Differentiation)
							
- QMUL (Implied Volatility)
							
							
					
Where my ideas came from: Twitter