Saving Data
This method is suggested on the official jsPsych website here: https://www.jspsych.org/overview/data/#storing-data-permanently-as-a-file, and in the tutorial listed here: https://kywch.github.io/jsPsych-in-Qualtrics/save-php/, so I presume it is currently in use by many researchers worldwide. As long as you haven’t asked participants to enter identifying information into your cognitive task itself, this data is anonymous and not sensitive - it can only be tied to specific participants through their random identifier stored in Qualtrics. I have tried to make some security improvements, listed below. So that you don’t have to take my word for it, I’ve linked the sources behind each change.
Considering the current pandemic, I hope this guide can help others like me who suddenly find the need to run online experiments on a small budget. If anyone has any suggestions for improvements, feel free to email me at m.lovell@sussex.ac.uk and I will be happy to update the document. I also don’t intend to take credit for much of the code below, and I owe many thanks to Kyoung Whan Choe for his excellent tutorial on which the below is very heavily based - I only list my changes to his method here, and direct you to the original tutorial where possible. Without the original tutorial to build off, I wouldn’t have had the time to TRY to improve on it here - I would’ve spent the whole time trying to get the project off the ground!
Server space set-up
Going back to the tutorials on kywch.github, the section on saving data to a web server with PHP is not entirely applicable to members of the University of Sussex. You can see the university’s guidance on data storage here. Before you can follow along with the kywch.github PHP tutorial, you will need to set up a personal web space with the university, and follow this guide for authorising your computer to access your personal (N) drive. As a Mac user, I noted that the SFTP client (essentially just software for transferring files) recommended by the university was not on the App Store, so I opted for the free and open-source program ‘FileZilla’, which is widely used. Windows and Linux users may also want to use FileZilla, as that is what I’ll be referring to in this tutorial. Note when signing in that the host is `sftp://unix.sussex.ac.uk` and the port is `22`. Finally, you can open your command line terminal, type the command `ssh [username]@unix.sussex.ac.uk`, and log in with your university password.
Beginning the tutorial
You can now follow the steps in the guide ‘Saving Data with PHP’, which lead you to make the exp_data directory within public_html, but stop there as we’ll do things a little differently.
Firstly, the line suggesting `$ echo "DirectoryIndex index.html" >> .htaccess` won’t work (so `$ touch index.html` isn’t needed either). We aren’t allowed to redefine these values, and this is the default behaviour on the university servers anyway. Secondly, the line `$ chmod 772 hello-world` won’t work on the university servers. Instead, the tightest setting we can use is `chmod 1703`; you can find more about these permission codes here (note the first `1` refers to a ‘sticky bit’).
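For readers unfamiliar with these codes, here is a minimal sketch of how a four-digit mode such as 1703 breaks down. The helper names `perm_triplet` and `describe_mode` are made up for illustration and do not come from either tutorial; the script only does arithmetic and touches no files.

```shell
#!/bin/sh
# Translate one permission digit (0-7) into rwx notation.
perm_triplet() {
  digit=$1
  out=""
  if [ $(( $digit & 4 )) -ne 0 ]; then out="${out}r"; else out="${out}-"; fi
  if [ $(( $digit & 2 )) -ne 0 ]; then out="${out}w"; else out="${out}-"; fi
  if [ $(( $digit & 1 )) -ne 0 ]; then out="${out}x"; else out="${out}-"; fi
  printf '%s' "$out"
}

# Describe a four-digit octal mode such as 1703 (hypothetical helper).
describe_mode() {
  value=$(( 0$1 ))   # the leading 0 makes the shell read the number as octal
  if [ $(( ($value >> 9) & 1 )) -ne 0 ]; then sticky="on"; else sticky="off"; fi
  printf 'sticky=%s owner=%s group=%s other=%s\n' "$sticky" \
    "$(perm_triplet $(( ($value >> 6) & 7 )))" \
    "$(perm_triplet $(( ($value >> 3) & 7 )))" \
    "$(perm_triplet $(( $value & 7 )))"
}

describe_mode 1703   # sticky=on owner=rwx group=--- other=-wx
describe_mode 1701   # sticky=on owner=rwx group=--- other=--x
describe_mode 1704   # sticky=on owner=rwx group=--- other=r--
```

On a directory, the sticky bit generally means that only a file’s owner (or the directory’s owner) can delete or rename the files inside it.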
Instead of doing this, we can increase security somewhat by following the steps found here: http://www.mysql-apache-php.com/fileupload-security.htm and integrating them with the guide from the Kywch website above. That means we won’t need the ‘hello-world’ directory within exp_data - your exp_data directory only needs the save_data.php file within it, and having any subdirectories within that folder could pose issues.
Two quick side notes relating to the tutorial: file_uploads are already switched on - your php.ini file is located in a hidden subfolder called ‘/etc’ (how to) - and you can see that we are running Apache by putting your username into the following bash command: `curl -s -I https://users.sussex.ac.uk/[username]/ | grep Server`.
Changes to the PHP file
I would suggest making your save_data.php file in your IDE and moving it to the correct location (exp_data) with FileZilla, rather than using Vim. I’ve made some changes to the php file that should increase security somewhat - you can see that file here: https://github.com/Max-Lovell/online-experiments/blob/main/save_data.php. I’ll go through what these changes are and why they’ve been made, in order of appearance - skip this section if you’re not interested!
Blocking access from the save_data URL
The first few lines of code, starting with `if ( $_SERVER['REQUEST_METHOD']=='GET'`, make it so that people can’t just see whatever error message the PHP file throws up when they try to access it through the standard URL. If you go to something like https://users.sussex.ac.uk/[YOUR_USERNAME]/exp_data/save_data.php, it will look like nothing’s there; if you take these lines out, you’ll see an error message that reveals something about the code that’s best left hidden.
CORS (access control)
The line `header('Access-Control-Allow-Origin: *')` in save_data.php doesn’t seem to work. Comments in the PHP file do suggest changing it to `header('Access-Control-Allow-Origin: https://ssd.az1.qualtrics.com')` for tighter security (the '*' means 'allow any website'). If you go to your developer console when running the study in Qualtrics you would find that this throws an error. If you are using the Sussex Qualtrics account, you would need to change the allowed origin to https://universityofsussex.eu.qualtrics.com to make this error go away. However, this still doesn’t work - whatever we put in the allowed origins section, the files will still write to our folder, which isn’t good (the header only tells the browser whether the page may read the response; the server still runs the script). Following the highest upvoted comment in the Stack Overflow question linked above, you might want to just remove this code, as I assume our URL is well protected. However, we can follow the example at the bottom of this page and instead wrap our entire PHP code, after `<?php` and before `?>`, with the following: `if($_SERVER['HTTP_ORIGIN'] == 'https://universityofsussex.eu.qualtrics.com') { ...rest of code... } else { exit; }`. You can test that this works by changing the allowed origin in the PHP file and trying out the Qualtrics program. If one wanted to implement CORS better than this, the answer on Stack Overflow is a good place to start, along with this and this, apparently.
Blocking executables
The chunk of code that declares a `$blacklist` is straight from mysql-apache-php.com, and I’ve placed it up top to catch any executable files being uploaded immediately.
Sanitising data
Apparently, the `filter_input_array` call from the Kywch.github tutorial isn’t actually for stopping XSS attacks, but I reckon it’s better to call this function before we even touch our JSON file, so I’ve put it up here. It basically removes any characters and strings that might be dangerous before the rest of the script touches them.
Checking for JSON files
The guide at mysql-apache-php.com checks if what is uploaded is an image file. This isn’t applicable to us, but we can check whether what has been uploaded is valid JSON with the `json_decode()` and `json_last_error()` code adapted from the answers here.
Checking our variables exist
The `if (isset($_POST…` stuff just checks whether the variables we are getting from the jQuery.ajax() function we put into Qualtrics exist; if they do, they are assigned to a PHP variable, and if any of them aren’t there, the program exits. You might notice, if you looked at the original tutorial, that there’s no mention of ‘data_dir’ anymore - it’s better not to send the location of where we store our files over the internet, so I’ve removed it and we’re going to put what was in data_dir straight into the bottom of the PHP file. This means that you will need to make a new PHP file (or edit this one) for each new experiment with that new location - this should make more sense once we get to that section.
Moving the file contents
`file_put_contents($data_dir.'/'.$file_name, $exp_data);` has become `file_put_contents('../../hello-world/hello_'.$file_name, $exp_data);`. This function is the bit that grabs the data and puts it in the folder we want. Since we’re not getting that information from our Qualtrics survey, it’s been hard-coded in, hence the ‘hello-world’, which you’ll want to replace with something more relevant to your own experiment (we’ll get to making this folder in a moment). The reason I’ve added `../../` is that, according to the tutorial at mysql-apache-php.com and a few others online, it’s a good idea to not have our data accessible by a public URL. Instead, we’ll make our uploads folder outside of public_html (our ‘wwwroot’). If your current save_data.php file is in a subfolder of ‘public_html’ called ‘exp_data’, as the Kywch.github example suggests, then we’ll need to go up two folder levels. `../` means going up a folder level from wherever you currently are (i.e. where the save_data.php file we’re writing in is). So, to go up two levels and get out of public_html we use `../../`. The reason for `hello_'.$file_name` is that file_name is just going to contain the Qualtrics participant ID and nothing else - the less sent online the better. The full stop is how you concatenate the `$file_name` variable onto the `hello_` string.
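To see where `../../hello-world` ends up, the following sketch rebuilds the folder layout in a temporary directory and resolves the relative path starting from exp_data. It assumes PHP resolves the path relative to the folder containing save_data.php, which is the usual behaviour under Apache but worth checking on your own server; `home_dir`, `script_dir` and `target_dir` are illustrative names only.

```shell
#!/bin/sh
# Build a throwaway copy of the layout (folder names are this guide's examples).
home_dir=$(mktemp -d)
mkdir -p "$home_dir/public_html/exp_data" "$home_dir/hello-world"

# Start where save_data.php lives, then follow the relative path it uses.
script_dir="$home_dir/public_html/exp_data"
target_dir=$(cd "$script_dir/../../hello-world" && pwd -P)
home_real=$(cd "$home_dir" && pwd -P)

echo "script runs in: $script_dir"
echo "data lands in:  $target_dir"

# The data folder is a sibling of public_html, not a child of it.
case "$target_dir" in
  "$home_real"/public_html/*) inside="yes" ;;
  *) inside="no" ;;
esac
echo "inside public_html: $inside"

# Remove the throwaway layout again.
rm -rf "$home_dir"
```

Because the resolved folder sits next to public_html rather than inside it, there is no URL under your web space that maps onto it.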
BASH
Now, we’re going to create the folder where our data will upload (in the example on kywch.github it’s called ‘hello-world’). We’re going to make this outside of the ‘wwwroot’, i.e. outside our public_html folder, so it won’t be accessible using a URL in a browser. We can move up a directory level in the terminal with the command `cd ..` (note the space). Do this until you are in the folder you began in after signing in with `ssh`, before moving into the public_html folder - you can use the bash command `pwd` to make sure you are in /its/home/[username], or `ls` to make sure you can see the public_html folder listed. This is where you want to create the directory for your data (i.e. ‘hello-world’). Assign it the permission `chmod 1703`.
We can also set our other permissions slightly differently. exp_data requires execute permissions only, as your JavaScript links directly to the save_data.php file inside it, hence you can assign it the permission `chmod 1701`. Any permissions above 0 on the save_data.php file are allowed, and limiting this to read only would be the best idea. Hence, assign this file `chmod 1704`. Finally, change the permissions of public_html to `1701`.
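Putting the permission steps together, a rehearsal in a throwaway directory might look like the sketch below. It assumes the example folder names used in this guide and an empty stand-in for save_data.php; on the real server you would run only the `mkdir` and `chmod` lines from your home folder. `show_mode` is an illustrative helper, not part of either tutorial.

```shell
#!/bin/sh
# Rehearse the layout in a temporary directory instead of the real home folder.
home_dir=$(mktemp -d)
mkdir -p "$home_dir/public_html/exp_data" "$home_dir/hello-world"
: > "$home_dir/public_html/exp_data/save_data.php"   # empty stand-in file

chmod 1703 "$home_dir/hello-world"                        # upload folder, outside public_html
chmod 1701 "$home_dir/public_html/exp_data"               # folder holding save_data.php
chmod 1704 "$home_dir/public_html/exp_data/save_data.php"
chmod 1701 "$home_dir/public_html"

# Print the first ten characters of `ls -ld` (file type plus permission bits).
show_mode() { ls -ld "$1" | cut -c1-10; }

upload_mode=$(show_mode "$home_dir/hello-world")
web_mode=$(show_mode "$home_dir/public_html")
exp_mode=$(show_mode "$home_dir/public_html/exp_data")
php_mode=$(show_mode "$home_dir/public_html/exp_data/save_data.php")

echo "hello-world   $upload_mode"
echo "public_html   $web_mode"
echo "exp_data      $exp_mode"
echo "save_data.php $php_mode"

# Remove the rehearsal directory again.
rm -rf "$home_dir"
```

In the `ls` output a trailing `t` or `T` marks the sticky bit, so it is a quick way to confirm each `chmod` did what you expected.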
.htaccess
Next, following the example in the mysql-apache-php.com post above, in your script editor, create a file called ‘.htaccess’ with the following content:
    Options -Indexes
    Options -ExecCGI
    AddHandler cgi-script .php .php3 .php4 .phtml .pl .py .jsp .asp .htm .shtml .sh .cgi
    <Files ^(*.json)>
    order deny,allow
    deny from all
    </Files>
Note: if you are dealing with csv files and not json, change `<Files ^(*.json)>` to `<Files ^(*.csv)>`. These are explained here and here. You can then put this .htaccess file into the uploads folder (e.g. hello-world) using FileZilla. Note that this file might be hidden - for example, if you are on a Mac, you will need to press ⌘CMD+⇧SHIFT+. in Finder to reveal hidden files.
Qualtrics and JavaScript changes
Now, follow along with the rest of the tutorial at kywch.github regarding how to add JavaScript code in order to save your data. However, you can make the following changes to that technique when you are done. Firstly, we won’t be needing most of this:
    var task_name = "hello-world";
    var sbj_id = "${e://Field/workerId}";
    var save_url = "https://users.rcc.uchicago.edu/~kywch/exp_data/save_data.php";
    var data_dir = task_name;
    var file_name = task_name + '_' + sbj_id;
All we need are the sbj_id and save_url variables. Let’s now make the following changes to the jQuery.ajax() function:
    function save_data_json() {
        jQuery.ajax({
            method: 'POST',
            dataType: 'json',
            cache: false,
            url: save_url,
            data: {
                file_name: sbj_id + '.json',
                exp_data: jsPsych.data.get().json()
            }
        });
    }
A few things have happened here. file_name has been changed to just have the subject id, and there’s no data_dir. The reason for this is that we’re limiting the information we’re sending over the internet - these things are all just hard-coded into our PHP file. The `type` setting is an older alias for `method` in jQuery, so we’ll use `method: 'POST'` instead. Finally, I’ve added `dataType: 'json'` too; note that in jQuery this setting describes the type of response expected back from the server, rather than the data being sent.
Downloading the data
Make sure you have followed all the other steps in the tutorial at kywch.github, then try running your experiment in Qualtrics; hopefully the data should save to your target upload directory (e.g. ‘hello-world’) - you will need to refresh FileZilla before you can see the file appear. Since we can’t just access the data through a public URL anymore, we need to download it in a slightly different way than is suggested in the article. As our data is now housed outside public_html, and as such is a ‘protected system folder’, we’ll need to navigate to the left-hand panel of FileZilla called ‘local site’, which lists the contents of our own computers, and create a directory where we want to put our data (e.g. navigate to the ‘documents’ folder on your computer and create ‘exp_data’). Then, when we right-click -> download on the remote site on the right-hand side of the screen, the files will be downloaded to this folder, and we can find them on our own computers. If you have a large amount of data, you may want to create a zip folder as suggested in Kyoung’s example. More info here and here.
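If you would rather stay in the terminal, the same download can probably be done with `scp`. The sketch below is only a rough outline: it assumes the university server accepts `scp` over the same host and port used for SFTP, which I have not confirmed, and the variable names and the `RUN` switch are made up for illustration. By default it only prints the command it would run.

```shell
#!/bin/sh
remote_user="username"                # placeholder: replace with your university username
remote_host="unix.sussex.ac.uk"       # host given earlier in this guide
remote_dir="hello-world"              # upload folder, relative to your home directory
local_dir="$HOME/Documents/exp_data"  # example local destination from this guide

download_source="${remote_user}@${remote_host}:${remote_dir}"

# Set RUN=1 in the environment to perform the copy; otherwise nothing is transferred.
if [ "${RUN:-0}" = "1" ]; then
  mkdir -p "$local_dir"
  scp -P 22 -r "$download_source" "$local_dir"
else
  echo "would run: scp -P 22 -r $download_source $local_dir"
fi
```

A remote path without a leading slash is taken relative to your home folder on the server, which is where the upload directory was created.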
You can find my PHP and .htaccess file at my GitHub repo here.
Servers
Node.js: https://nodejs.org/en/
Django: https://www.djangoproject.com/
Databasing
Google firebase: https://firebase.google.com/pricing
JATOS: https://www.jatos.org/
MySQL sussex: https://www.sussex.ac.uk/its/help/faq.php?faqid=2695
jsPsych: https://www.jspsych.org/overview/data/#storing-data-permanently-in-a-mysql-database
Edinburgh uni: https://www.ed.ac.uk/information-services/computing/audio-visual-multi-media/web-hosting/hosting-service-options
Use drop-box: https://kywch.github.io/jsPsych-in-Qualtrics/save-dropbox/
Box-app: https://developer.box.com/guides/tooling/cli/quick-start/create-jwt-app/
Useful resources for going down this route will be the Node.js SDK (https://developer.box.com/sdks-and-tools/), the JavaScript SDK for Box (https://github.com/box/box-node-sdk#readme), the Qualtrics JavaScript API class, and information on application scopes (https://developer.box.com/guides/api-calls/permissions-and-errors/scopes/).