SMS and the PDU format

Introduction

The SMS message, as specified by the Etsi organization (documents GSM 03.40 and GSM 03.38), can be up to 160 characters long, where each character is 7 bits according to the 7-bit default alphabet. Eight-bit messages (max 140 characters) are usually not viewable by the phones as text messages; instead they are used for data in e.g. smart messaging (images and ringing tones) and OTA provisioning of WAP settings. 16-bit messages (max 70 characters) are used for Unicode (UCS2) text messages, viewable by most phones. A 16-bit text message of class 0 will on some phones appear as a Flash SMS (aka blinking SMS or alert SMS).

The PDU format

There are two ways of sending and receiving SMS messages: by text mode and by PDU (protocol description unit) mode. The text mode (unavailable on some phones) is just an encoding of the bit stream represented by the PDU mode. Alphabets may differ and there are several encoding alternatives when displaying an SMS message. The most common options are "PCCP437", "PCDN", "8859-1", "IRA" and "GSM". These are all set by the at-command AT+CSCS, when you read the message in a computer application. If you read the message on your phone, the phone will choose a proper encoding. An application capable of reading incoming SMS messages, can thus use text mode or PDU mode. If text mode is used, the application is bound to (or limited by) the set of preset encoding options. In some cases, that's just not good enough. If PDU mode is used, any encoding can be implemented.

Receiving a message in the PDU mode

The PDU string contains not only the message, but also a lot of meta-information about the sender, his SMS service center, the time stamp etc. It is all in the form of hexa-decimal octets or decimal semi-octets. The following string is what I received on a Nokia 6110 when sending the message containing "hellohello" from www.mtn.co.za.

07

917283010010F5

040BC87238880900F10000993092516195800AE8329BFD4697D9EC37

This octet sequence consists of three parts: An initial octet indicating the length of the SMSC information ("07"), the SMSC information itself ("917283010010F5"), and the SMS_DELIVER part (specified by ETSI in GSM 03.40).

Note: on some phones (e.g. Ericssson 888?) the first three (colored) parts are omitted when showing the message in PDU mode!

All the octets above are hexa-decimal 8-bit octets, except the Service center number, the sender number and the timestamp; they are decimal semi-octets. The message part in the end of the PDU string consists of hexa-decimal 8-bit octets, but these octets represent 7-bit data (see below).

The semi-octets are decimal, and e.g. the sender number is obtained by performing internal swapping within the semi-octets from "72 38 88 09 00 F1" to "27 83 88 90 00 1F". The length of the phone number is odd, so a proper octet sequence cannot be formed by this number. This is the reason why the trailing F has been added. The time stamp, when parsed, equals "99 03 29 15 16 59 08", where the 6 first characters represent date, the following 6 represents time, and the last two represents time-zone related to GMT.

Interpreting 8-bit octets as 7-bit messages

This transformation is described in detail in GSM 03.38, and an example of the "hellohello" transformation is shown here. The transformation is based on the 7 bit default alphabet , but an application built on the PDU mode can use any character encoding.

Sending a message in the PDU mode

The following example shows how to send the message "hellohello" in the PDU mode from a Nokia 6110.

AT+CMGF=0 //Set PDU mode AT+CSMS=0 //Check if modem supports SMS commands AT+CMGS=23 //Send message, 23 octets (excluding the two initial zeros) >0011000B916407281553F80000AA0AE8329BFD4697D9EC37<ctrl-z> There are 23 octets in this message (46 'characters'). The first octet ("00") doesn't count, it is only an indicator of the length of the SMSC information supplied (0). The PDU string consists of the following: