Tiny Tiny RSS Migration

Nothing is simple. You learn that after a few decades in IT. All you can see are the tops of icebergs.

Tiny Tiny RSS

Tiny Tiny RSS (ttrss) is a fantastic application and tends to work flawlessly, once installed. I’ll repeat that last part – once installed – as installation often comes with a few challenges.

It was widely embraced after the demise of Google Reader and now has an extensive user base.

[Image] Software Engineers oversee another Tiny-Tiny RSS installation

ttrss consists of a PHP front-end and can connect to either a MySQL or PostgreSQL database backend. I favour the MySQL flavour, but word on the street is that PostgreSQL performs better. As I already had MySQL installed for WordPress on the same box, installing PostgreSQL just for ttrss seemed like overkill.

Step 1 – DB Migration
Step 2 – Installing ttrss on the new host
Step 3 – Install ttrss plugins
Step 4 – Verify installation

Once installed, you’ll need to schedule regular feed updates.
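If you drive the updates from cron, a minimal sketch looks like this (the /var/www/tt-rss path and the 30-minute interval are my assumptions, so adjust to suit; add it to the www-data user’s crontab with crontab -u www-data -e):

*/30 * * * * /usr/bin/php /var/www/tt-rss/update.php --feeds --quiet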

MySQL ttrss db Schema

Diagram produced using MySQL Workbench 6.0, which is now available for Ubuntu: sudo apt-get install mysql-workbench

mysql> create database ttrss;
mysql> use ttrss;
mysql> source ttrss_schema_mysql.sql;

NB: the above creates just the schema (little data except what is required to bootstrap the installation process)


When you import the data proper, you can overwrite the existing schema, use it as is, or use a separate database. The options are many!

DB Migration

The current project I am working on requires me to migrate the ttrss database over to a new server (host).
I thought all it would take was a couple of db dumps and the job would be done. I slithered out of bed at 2AM on Saturday morning to make an early start on this task, yet it feels like I’ll spend most of the day getting it sorted. (Update: Sunday afternoon and I’m still going, but the bulk of the work has been completed.)

NB: If you have a sandbox available I suggest you use it to iron out any hurdles you might encounter to make your life easier. Creating a blank schema and then ensuring your web browser can load and run the ttrss install script (connecting to your empty db) is highly recommended.
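A minimal sandbox setup might look like the following (the database name, user, and password here are placeholders, not from my actual setup):

mysql> CREATE DATABASE ttrss_test;
mysql> CREATE USER 'ttrss'@'localhost' IDENTIFIED BY 'changeme';
mysql> GRANT ALL PRIVILEGES ON ttrss_test.* TO 'ttrss'@'localhost';
mysql> FLUSH PRIVILEGES;

Then point your browser at the ttrss install script (typically under /install in the web root) and connect it to the empty database.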

Then attack your production environment with all the gusto you can muster.

Impenetrable Import

The ttrss_entries table can grow quite large when it contains more than a few thousand entries. This is because the content column essentially contains the source of entire web pages.

When I got to the import stage of the process it fell over with the error:

Red herring #1.

Looks like some rogue data has crept into the table and that is causing the INSERT statement to trip over itself. Time to whip out GVIM as other editors won’t be able to handle a file of this size without choking. That means I’ll need to look up some VIM commands as well 🙁

So I yanked the INSERT statement out of the original file and put it under the microscope. The search term I tried was ‘ctl00_p_wpcpage’ which is referenced in the error message, but there are simply too many instances in the data. No dice.

What clues do I have?

When I trash the table it reports that 5,344 rows were deleted. That suggests it fell over on the next row, i.e. row 5,345. Right? At least some data was imported, I guess.

I then began to pursue another approach: I restarted MySQL with the verbose option and used the source directive to load the INSERT values. Now I’m staring at a screen of whirling data as if I’m watching the Matrix.

(*thinks* This is taking an ungodly amount of time .. seriously)

Wow – it actually finished. When it fell over again she revealed a few more of her secrets:

Red herring #2.

I’m also having a sense of déjà vu about a previous encoding issue that I had with ttrss. It involved moving to the utf8mb4 charset, which has full Unicode support, unlike utf8. Check out this great article on that front. And there’s another one here detailing how to convert to the utf8mb4 collation format.

(It wasn’t!)
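For reference, a conversion to utf8mb4 typically looks something like the following. This is a sketch only, and as noted it wasn’t the fix in my case:

ALTER DATABASE ttrss CHARACTER SET utf8mb4 COLLATE utf8mb4_unicode_ci;
ALTER TABLE ttrss_entries CONVERT TO CHARACTER SET utf8mb4 COLLATE utf8mb4_unicode_ci;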

OK, so let’s take a look at the source table structure:

We can see that the collation on the content column is utf8_general_ci.

Is it the same on the destination table?

We can check using: mysql> SHOW FULL COLUMNS FROM ttrss_entries;

Same column definition and collation. That’s not it 🙁 Gotta keep digging further.

I already suspect something in the content field, but it will be difficult to view the column value using an SQL client due to its size. Maybe better to output the value to a file using the INTO OUTFILE directive.
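Something along these lines, assuming the suspect row has id 5345 (an assumption based on the row count above) and that the server is allowed to write files:

mysql> SELECT content FROM ttrss_entries WHERE id = 5345 INTO OUTFILE '/tmp/entry_content.txt';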

When trying to export to screen it chokes. When trying to output to file, I’m unable to locate the file produced. (See below for more info on why)

Looks like I’ll have to go back to VIM to extract the offending data. Feels impossible, like the proverbial ‘needle in a haystack.’

I also ran it against a localhost MySQL install on the laptop. Same error. Has to be something in the data, right? Or the way it was exported. (Hint: re-read that last statement.)

Maybe I should take a look at the error log? First a peek at /etc/mysql/conf.d to see where the logfile resides:

log_error = /var/log/mysql/error.log

Nothing there! Sigh.

iPage has limitations

Some can be worked around while others cannot1. You are stuck with using PHPMyAdmin through their website for all db admin tasks. It is what it is! 🙁

Simple SQL Math: No shell access + No remote db access = lower productivity

Having tried various approaches to export the db over the weekend, I think I can rule out the following in my situation:

(1) Using the mysql CLI. I see the following error in the PHP error log:

(2) Using mysqldump to export a tab-delimited file

Getting a similar error to the one above.

(3) Using ‘INTO OUTFILE’ via a SELECT query from PHPmyAdmin.

I think the problem here is that the MySQL processes are almost certainly running on a different box than the webserver. When I ping the SQL box I get an IP address of 38.113.1.135, whereas iPage.com resolves to 65.254.244.120. Large hosts surely have distributed architectures.

As others have discovered, you might be able to execute the SQL but never be able to locate the file produced.

The iPage kb articles only describe using PHPMyAdmin to export databases. So it’s pretty much a ‘closed system’ – I’m surprised in hindsight that mysqldump works, even if it has its limitations.

The Breakthrough

Finally!

I headed back to the PHPMyAdmin export screen figuring that they must allow the dump file to be saved somewhere. Once I spotted the row restriction filter I began to play around with it – lo and behold I was able to pump out a 100 MB dump file, and the whole table was dumped into 3 separate files. I can live with that!

The --extended-insert=FALSE option exports data with individual INSERT statements3.

This makes troubleshooting MUCH easier.

The individual export files (3 of them) all imported without error. Yippee. It just took all day.

OK, now let’s improve on that – to force individual INSERT statements to be dumped via mysqldump, we add the --extended-insert=FALSE option to the command. Works like a charm!

So my dump script in its entirety is as follows:
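The script boils down to a single mysqldump invocation. A minimal sketch, with placeholder host, user, and database names:

#!/bin/bash
# one INSERT statement per row makes a failed import easy to localise
mysqldump --extended-insert=FALSE -h dbhost -u dbuser -p ttrss > ttrss_dump.sql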

Running this script and using the mysql source command imports the schema and data without any errors, as expected. A sigh of relief.

Import post-mortem

In summary: not using the --extended-insert=FALSE mysqldump option caused the data to be exported in a format that proved too complex for the import process, due to the structure and size of the content column in the ttrss_entries table.

A StackOverflow article here describes a similar problem encountered by another user.

In no particular order:

Learn some VIM commands

This one is particularly useful for splitting files:

:.,$d (delete from the current line to the end of the file)
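For example, splitting a large dump in two might go like this (the split point of 5,000 lines is arbitrary):

:1,5000w part1.sql (write the first 5,000 lines out to part1.sql)
:1,5000d (delete those lines from the buffer)
:w part2.sql (write what remains to part2.sql)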

Importing data is not always a trivial exercise.

Depends on the structure of the table and the amount of data.

Error messages aren’t always specific enough to be useful.

It’s the cryptic ones that are hard to work with – look for verbose logging options where available and know where your server stores the various error logs.

The limitation imposed by a webhost can sometimes be a serious impediment to productivity.

When you’re caught up in a cycle of editing script files, uploading and executing them and then checking error logs it can chew through the hours pretty quickly.

When trying to solve a difficult problem try different approaches.

Take time away from the keyboard to allow the subconscious mind to work the problem. Writing things down (or blogging) helps in lieu of having someone else available to discuss the problem.

Guess I’ll have to chalk this one up to that old adage – “It’s easy when you know how”. In the end it was as simple as it should have been, provided you were intimately familiar with mysqldump.

That which hurts, teaches

Indeed.

ttrss installation and configuration

You can find a more detailed guide over at LifeHacker. As always I just sprinkle a few breadcrumbs where I feel it’s appropriate.

Tip: Do a fresh install – don’t migrate old files.

git clone https://tt-rss.org/git/tt-rss.git tt-rss
sudo chown -R www-data:www-data tt-rss

You may also need to change permissions on some of the ttrss files and directories.
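I didn’t record the exact command, but ttrss wants its cache, lock and feed-icons directories writable by the webserver, so something like this sketch should cover it:

sudo chmod -R 777 cache lock feed-icons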

Configuration file: these are the settings required to connect to the database. If you used the install script, you would have been provided with a generated file to save, based on your input. Of course, you can also edit the file manually.

Replace uid and pwd with your db user id and password respectively
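A minimal sketch of the relevant config.php lines (localhost and port 3306 are my assumptions for a same-box database):

define('DB_TYPE', 'mysql');
define('DB_HOST', 'localhost');
define('DB_USER', 'uid');
define('DB_NAME', 'ttrss');
define('DB_PASS', 'pwd');
define('DB_PORT', '3306');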

Install tt-rss plugins

I’m using feediron, so I needed to zip up the folder from the old server, move it across to the new one, and then change the file ownership to www-data.
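Roughly like so – the paths and hostname are placeholders, and newer ttrss versions may use plugins.local rather than plugins:

# on the old server
cd /var/www/tt-rss/plugins
zip -r /tmp/feediron.zip feediron
scp /tmp/feediron.zip newhost:/tmp/
# on the new server
cd /var/www/tt-rss/plugins
sudo unzip /tmp/feediron.zip
sudo chown -R www-data:www-data feediron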

You then need to log into ttrss and activate the plugin. After logging in, there is an Actions drop-down on the right-hand side of the window. Select Preferences and you will be taken to another screen where there is a Plugins bar. Click that bar and all the plugins will be listed. Additionally, once activated, a plugin will host its own tab on the same screen, where configuration changes can be made. (See below.)

Verify Installation

DB:

Quick and dirty: row count comparison for the ttrss_entries tables in both databases (see the query below)
Schema comparison: MySQL-diff
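The quick-and-dirty check is just a count run against both the old and the new database:

mysql> SELECT COUNT(*) FROM ttrss_entries;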

TTRSS:

Ensure you can log in after pointing your browser to the host where ttrss has been installed. You will be presented with the login screen.

Note that the default uid/password is admin/password. (Please change these after your installation!)

You should then see a screen similar to the following:

Footnotes


1. Important: If your database is larger than 10MB, you may want to download separate groups of database tables instead of everything all at once. We recommend this because the maximum file size you can upload when restoring your database is 10MB (10,240KB).

2. BTW, came across this little gem in my efforts to resolve this issue: pv sqlfile.sql | mysql -uxxx -pxxxx dbname. pv (Pipe Viewer) displays a progress bar for lengthy operations. Great stuff.

3. --extended-insert (the default, TRUE) uses multiple-row INSERT syntax. Setting it to FALSE uses single-row INSERT syntax.
