Importing CSV Data Into MySQL
Reading a CSV file and importing it into a MySQL database with Node.js

We’re going to be looking at how to read a CSV file and how to dump that data into a MySQL database.
Setting Everything Up
Let's go ahead and make sure we have these dependencies installed and imported.
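The fs module ships with Node itself, but mysql and fast-csv come from npm:

npm install mysql fast-csv

With those installed, the imports look like this: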
const fs = require('fs');
const mysql = require('mysql');
const csv = require('fast-csv');
Loading The CSV File
Next, let's go ahead and create a read stream for the file.
let stream = fs.createReadStream("my_file.csv");
Now we have our stream, let's use the CSV module to parse it. Note that we listen for two events (data and end): the data event fires each time a new line of the CSV has been parsed, and the end event fires once the file has been completely read.
let myArray = [];

let csvStream = csv
    .parse()
    .on("data", (data) => {
        myArray.push(data);
    })
    .on("end", () => {
    });

stream.pipe(csvStream);
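Each data event hands you one row as an array of strings, so by the time end fires, myArray will look something like this (the column names and values here are just illustrative):

// [
//   ['id', 'name', 'email', 'created_at'],  // the header row is still in there
//   ['1', 'Alice', 'alice@example.com', '2019-01-01 10:00:00'],
//   ...
// ]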
For those of you who have already looked at a CSV file in a text editor, you may be aware that the first line of the file contains the column headers. We want to get rid of that row, so let's just shift the array, which removes its first element.
.on("end", function () {
myArray.shift();
}
Inserting Multiple Records
I have already covered how to insert multiple records easily in my post Insert Multiple Records Into MySQL. It may be worth reading that before getting started on inserting the data into the database.
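As a quick refresher, the mysql module turns a nested array into a grouped, multi-row VALUES list. A small sketch using mysql.format to show the expansion (table and column names are just placeholders):

const mysql = require('mysql');

// a nested array becomes one multi-row VALUES list
const rows = [['apple', 1], ['banana', 2]];
const sql = mysql.format('INSERT INTO my_table (name, qty) VALUES ?', [rows]);
console.log(sql);
// INSERT INTO my_table (name, qty) VALUES ('apple', 1), ('banana', 2)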
On the end event, we're going to open up a new database connection and bulk insert the records using MySQL's native bulk insert functionality.
.on("end", function () {
// remove the first element in the array
myArray.shift();
// create a new connection to the database
const connection = mysql.createConnection({
host: 'localhost',
user: 'username',
password: 'super_secure_password',
database: 'database_name'
});
// open the connection
connection.connect((error) => {
if (error) {
console.error(error);
} else {
let query: string = 'INSERT INTO table (field1, field2, field3, field4) VALUES ?';
connection.query(query, [myArray], (error, response) => {
if(error) {
console.log(error);
} else {
console.log(response); // x records inserted
});
}
});
}
Known Bugs
Packet Too Large
There are a few issues I've had when implementing this. As discussed in Insert Multiple Records Into MySQL, a large CSV file can lead to the database throwing an error because the insert packet exceeds the 'max_allowed_packet' variable. You can see an easy fix for that below.
I was trying to bulk insert around 180k records from a CSV file, and I kept getting an error. When I traced the error back to the MySQL output in the console, I could see that my MySQL server was running with a fairly small 'max_allowed_packet'.
To check what packet size you currently have set, run the first command below; the second one updates the variable (33554432 bytes = 32 MB).
show variables like 'max_allowed_packet';
set global max_allowed_packet=33554432;
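Note that SET GLOBAL only lasts until the server restarts, and only affects new connections. If you'd rather make the change permanent, you can set it in the MySQL config file instead (the exact path varies by platform):

# e.g. /etc/mysql/my.cnf
[mysqld]
max_allowed_packet=32M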
Incorrect DateTime For MySQL
Something else I had to deal with: the date field in the CSV did not match the DATETIME format the MySQL database expects. To solve this, I just looped through the array and replaced the element at the relevant index with a converted value. Something along the lines of this:
.on("end", function () {
myArray.shift();
for (let i = 0, len = myArray.length; i < len; i++) {
myArray[i][3] = convertToMysql(myArray[i][3]);
}
// ...
The 'convertToMysql' call on line 4 is a custom function I wrote. You will have to write your own, as your input may differ from mine.
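As a rough sketch, assuming the CSV held dates like '03/25/2019 14:30' (yours will almost certainly differ), the conversion might look like this:

// a minimal sketch: assumes 'MM/DD/YYYY HH:MM' input
// MySQL DATETIME wants 'YYYY-MM-DD HH:MM:SS'
function convertToMysql(input) {
    const [date, time] = input.split(' ');
    const [month, day, year] = date.split('/');
    return `${year}-${month}-${day} ${time}:00`;
}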
Side Note: If you don't want to shift the array to remove the header row, you can pass an options object to the CSV stream, as below.
csv
    .fromStream(stream, {headers: true})
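One thing to watch if you go this route: with headers enabled, fast-csv emits each row as an object keyed by column name rather than an array, so you'd need to pull the values back out before the bulk insert. Something like this (the field names are placeholders):

.on("data", (row) => {
    // row is an object like { field1: '...', field2: '...' }
    myArray.push([row.field1, row.field2, row.field3, row.field4]);
})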
Errors In The CSV
This one is kind of situational, but there are two events you can listen to that may help here: the 'error' event and the 'data-invalid' event. The error event is triggered when an error occurs, such as the file not being found or not being valid CSV. The data-invalid event is called when a row contains invalid data.
.on('error', (error) => {
    // error event handler
})
.on('data-invalid', (data) => {
    // data was invalid
});
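Worth noting: in fast-csv, data-invalid fires for rows that fail a validate function you supply. A small sketch, with a made-up rule that the first column must not be empty:

let csvStream = csv
    .parse()
    .validate((data) => data[0] !== '') // hypothetical rule
    .on('data-invalid', (data) => {
        console.log('skipping invalid row:', data);
    })
    .on('data', (data) => {
        myArray.push(data);
    });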
Side Note: “Any rows consisting of nothing but empty strings and/or commas will be skipped, without emitting a ‘data’ or ‘error’ event.”
Whole Thing
If you want to double-check your code after reading through, it might look something like the below.
const fs = require('fs');
const mysql = require('mysql');
const csv = require('fast-csv');

let stream = fs.createReadStream("path_to_my_file.csv");
let myData = [];

let csvStream = csv
    .parse()
    .on("data", function (data) {
        myData.push(data);
    })
    .on("end", function () {
        // remove the header row
        myData.shift();

        // create a new connection to the database
        const connection = mysql.createConnection({
            host: 'localhost',
            user: 'username',
            password: 'super_secure_password',
            database: 'database_name'
        });

        // open the connection
        connection.connect((error) => {
            if (error) {
                console.error(error);
            } else {
                // 'table' is a reserved word in MySQL, so use your real table name
                let query = 'INSERT INTO my_table (field1, field2, field3, fieldn) VALUES ?';
                connection.query(query, [myData], (error, response) => {
                    console.log(error || response);
                    connection.end(); // close the connection once we're done
                });
            }
        });
    });

stream.pipe(csvStream);
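One caveat with this approach: every row is buffered into myData before anything is inserted, so memory usage grows with the size of the file. If you're dealing with millions of records, you'd want to flush the rows in batches instead. A rough sketch (the batch size and table/field names are arbitrary placeholders):

// flush rows in chunks rather than one giant insert
const BATCH_SIZE = 5000; // tune against your max_allowed_packet

function insertInBatches(connection, rows, callback) {
    if (rows.length === 0) return callback();
    const batch = rows.splice(0, BATCH_SIZE);
    const query = 'INSERT INTO my_table (field1, field2, field3, field4) VALUES ?';
    connection.query(query, [batch], (error) => {
        if (error) return callback(error);
        insertInBatches(connection, rows, callback); // recurse on the remainder
    });
}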
And that just about does it. I hope you found this useful! If you have any issues or suggestions, please don't hesitate to comment!