Fix Database Encoding and Make Search Case-Insensitive (v3.2)

In Newscoop versions older than 3.2, the server-side database encoding was not properly set to UTF-8. The client connection encoding was not set either, so it usually fell back to the MySQL default, latin1. As a result, text that was entered as UTF-8 was usually stored as if it were latin1.
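The effect of such a mismatch can be sketched in a few lines of Python. This is only an illustration, not Newscoop code: the helper names are hypothetical, and Python's "latin-1" codec is assumed here as a stand-in for MySQL's latin1 character set (the two differ for a handful of byte values).

```python
# Illustration only: hypothetical helpers, with Python's "latin-1" codec
# assumed as a stand-in for MySQL's latin1 character set.

def store_via_latin1_connection(text: str) -> str:
    """Simulate UTF-8 bytes being interpreted as latin1 characters."""
    return text.encode("utf-8").decode("latin-1")


def repair(mangled: str) -> str:
    """Reverse the misinterpretation and recover the original text."""
    return mangled.encode("latin-1").decode("utf-8")


original = "Žižek café"
mangled = store_via_latin1_connection(original)

# Every non-ASCII character has turned into two unrelated characters.
print(len(original), len(mangled))
# The damage is reversible as long as the bytes were stored unchanged.
print(repair(mangled) == original)
```

Because the stored bytes are intact and only their interpretation is wrong, the data can be recovered by reading it back with the encoding it was written under.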

By default, MySQL string matching is case-insensitive, but because of the incorrect encoding this does not work for languages other than English.
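A rough Python analogy shows why case-insensitive matching breaks. It assumes that Python's `casefold()` behaves similarly enough to a case-insensitive MySQL collation for this purpose, and that the "latin-1" codec approximates MySQL's latin1; the helper name is hypothetical.

```python
# Analogy only: str.casefold() stands in for a case-insensitive collation,
# and Python's "latin-1" codec for MySQL's latin1 (both are assumptions).

def misread_as_latin1(text: str) -> str:
    """Return what UTF-8 bytes look like when read as latin1 characters."""
    return text.encode("utf-8").decode("latin-1")


def same_ignoring_case(a: str, b: str) -> bool:
    """Case-insensitive comparison on already-decoded text."""
    return a.casefold() == b.casefold()


# Correctly decoded text: upper and lower case match.
correct_match = same_ignoring_case("ÉCOLE", "école")

# Misread text: the second byte of each accented letter no longer has a
# case relationship, so the comparison fails.
broken_match = same_ignoring_case(misread_as_latin1("ÉCOLE"),
                                  misread_as_latin1("école"))

# ASCII-only text is unaffected, which is why English still works.
ascii_match = same_ignoring_case(misread_as_latin1("SCHOOL"),
                                 misread_as_latin1("school"))

print(correct_match, broken_match, ascii_match)
```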

Newscoop 3.2 fixes the encoding issue by setting both the server-side and the client-side encoding to UTF-8. However, this fix only applies to newly created databases. Existing databases remain encoded in latin1 (or whatever your MySQL default was) on the server side. To fix them, take the following steps:

  1. Back up your Newscoop instance(s) using the newscoop-backup script.
  2. Upgrade to Newscoop 3.2.
  3. Run the newscoop-restore script with the -s or -c option (to learn more, run newscoop-restore --help). CAUTION: if you run the script with these options and your data was already encoded in UTF-8, your database will become unusable. If that happens, you can run the restore script again with different options, or without any, to restore your database.

IMPORTANT NOTE

If you moved your site between servers with different configurations, your data may be partially encoded in UTF-8 and partially in latin1 or another character set. Before converting the database to UTF-8, check your site to see whether some of the text shows up correctly and some does not. If so, you may need to convert only some tables to UTF-8, e.g. only the Articles table or only the liveuser_users table. This decision is harder to make and needs expert advice. In this case, at step 3, the first command would have to be modified this way:

mysqldump -u [user] -p --default-character-set=latin1 [dbname] \
  --tables [list_of_tables] > [dbname].sql
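When deciding which tables are affected, a simple heuristic can help flag values that look like UTF-8 misread as latin1. The following Python sketch is an assumption-laden aid, not part of Newscoop: the function name is hypothetical, Python's "latin-1" codec approximates MySQL's latin1, and the check can occasionally misjudge genuine text, so it does not replace manual inspection.

```python
# Heuristic sketch with a hypothetical function name; Python's "latin-1"
# codec is assumed as an approximation of MySQL's latin1.

def looks_double_encoded(text: str) -> bool:
    """Return True if text looks like UTF-8 bytes misread as latin1."""
    try:
        # Undo the suspected misreading: back to bytes, then decode as UTF-8.
        candidate = text.encode("latin-1").decode("utf-8")
    except (UnicodeEncodeError, UnicodeDecodeError):
        # Either the text has characters outside latin1, or the bytes are
        # not valid UTF-8; in both cases it was not misread this way.
        return False
    # Pure ASCII round-trips unchanged and is not considered damaged.
    return candidate != text


samples = ["cafÃ©", "café", "cafe"]
flags = [looks_double_encoded(s) for s in samples]
print(flags)
```

Running such a check over a sample of values from each table gives a first indication of which tables might need to be listed in the mysqldump command.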

Note On Encoding

If your database encoding was not latin1, replace latin1 with your encoding in the commands above.

You can determine your encoding by querying certain MySQL server variables, for example:

mysql> SHOW VARIABLES LIKE 'character_set%';

and

mysql> SHOW VARIABLES LIKE 'collation%';

or you can run other SHOW commands to see charset and collation info:

(for 'newscoop' database)
mysql> SHOW CREATE DATABASE newscoop;

(per table, in this case 'TopicFields' from 'newscoop' database)
mysql> SHOW CREATE TABLE newscoop.TopicFields;

Once the database has been converted to UTF-8, search should be case-insensitive for any language.
