Technote Details :: How XML Import-Export handles encoding
Issue
An XML file declares its character encoding in the XML header (for example, <?xml version="1.0" encoding="UTF-8"?>). The database that serves as the source of the information (or as the target of the import) uses an encoding as well. The question is which encoding to specify for each so that the two work together.
Reason
When you use the XML Import-Export extension to create XML files or to import information into the database, you must take into account the encoding in which the data is stored. The XML file and the database can use different encodings, and a mismatch leads to garbled characters. The encoding problem has two main cases: import and export.
Solution
For XML Import
When importing XML information into a database, the encoding used in the XML file is automatically detected. You do not need to supply this information. The encoding used in the database cannot be detected automatically, so it must be specified. Depending on what you are using, there are two situations:
If you use the XML Import Wizard, the user interface has no option for defining the database encoding. Instead, the wizard takes it from the charset set for the page (the <meta> charset declaration in the <head> tag). Most likely the database uses the same encoding as the page (otherwise information would be displayed incorrectly). If the database does use a different encoding, edit the server behavior that the wizard adds and define the database encoding in its Advanced tab.
If you use the XML Import Transaction server behavior directly, you can set the database encoding in the Advanced tab.
If the database uses the same encoding as the page, you can safely keep the automatically filled-in values.
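The import steps above can be sketched in Python to illustrate the principle. This is a minimal sketch and not the extension's own code: the function names detect_xml_encoding and recode_for_database are hypothetical, and the sketch assumes an ASCII-compatible encoding whenever no byte order mark is present.

```python
import re

# Hypothetical helper names; this is not the extension's own code.
_DECLARATION = re.compile(
    rb'<\?xml[^>]*?encoding\s*=\s*["\']([A-Za-z][A-Za-z0-9._-]*)["\']'
)

def detect_xml_encoding(raw: bytes, default: str = "utf-8") -> str:
    """Return a Python codec name for the XML bytes in `raw`."""
    # A byte order mark, when present, identifies the encoding directly.
    if raw.startswith(b"\xef\xbb\xbf"):
        return "utf-8-sig"
    if raw.startswith((b"\xff\xfe", b"\xfe\xff")):
        return "utf-16"
    # Otherwise read the encoding attribute of the XML header.
    match = _DECLARATION.match(raw)
    if match:
        return match.group(1).decode("ascii").lower()
    # No declaration: fall back to the assumed default (UTF-8).
    return default

def recode_for_database(raw: bytes, database_encoding: str) -> bytes:
    """Decode with the detected XML encoding, re-encode for the database."""
    text = raw.decode(detect_xml_encoding(raw))
    return text.encode(database_encoding)

# An ISO-8859-1 file imported into a database that stores UTF-8.
sample = '<?xml version="1.0" encoding="ISO-8859-1"?><name>Jos\u00e9</name>'.encode("iso-8859-1")
stored = recode_for_database(sample, "utf-8")
```

The file's encoding is read from the file itself, while the database encoding has to be passed in. This mirrors why only the database encoding appears as a setting in the Advanced tab.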
For XML Export
When creating an XML file from database information, you must specify both the database encoding and the encoding to use for the XML file. Again, there are two situations:
You are using PHP as the server model and have the mbstring extension loaded and enabled. In this case, the only requirement is that the database encoding you specify is the correct one. You can specify any encoding you want for the output XML file, and the conversion is performed automatically.
You are using ASP, ColdFusion, or PHP without the mbstring extension. In this case you must specify the correct encoding for the database and the same encoding for the XML file. If you specify a different encoding for the XML file, you do not get an error; the information is exported as is, without any conversion. The XML encoding you specify is written to the XML header but has no effect on the content, so the header declares an encoding that the data does not actually use, and applications that read the file may misinterpret or reject non-ASCII characters.
In the second situation, always set the same encoding for both the XML file and the database.
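The difference between the two export situations can be illustrated with a short Python sketch. It is an illustration under stated assumptions, not the extension's implementation: build_xml and its convert flag are hypothetical, and the sketch assumes ASCII-compatible encodings and a value that contains no markup characters.

```python
def build_xml(db_bytes: bytes, database_encoding: str,
              xml_encoding: str, convert: bool) -> bytes:
    """Wrap one database value in a minimal XML document (hypothetical helper)."""
    # The header always declares the XML encoding that was requested.
    header = f'<?xml version="1.0" encoding="{xml_encoding}"?>'.encode("ascii")
    if convert:
        # First situation: decode from the database encoding and
        # re-encode to the XML encoding (the role mbstring plays in PHP).
        body = db_bytes.decode(database_encoding).encode(xml_encoding)
    else:
        # Second situation: the bytes are written as is, unconverted.
        body = db_bytes
    return header + b"<name>" + body + b"</name>"

# A value stored in an ISO-8859-1 database, exported to a UTF-8 XML file.
stored = "Jos\u00e9".encode("iso-8859-1")

converted = build_xml(stored, "iso-8859-1", "utf-8", convert=True)
unconverted = build_xml(stored, "iso-8859-1", "utf-8", convert=False)

# The converted file decodes cleanly with the declared encoding.
assert converted.decode("utf-8").endswith("<name>Jos\u00e9</name>")

# The unconverted file declares UTF-8 but contains ISO-8859-1 bytes.
try:
    unconverted.decode("utf-8")
    readable = True
except UnicodeDecodeError:
    readable = False
```

In the unconverted case no error is raised at export time; the problem only appears when something reads the file using the encoding declared in its header.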