

# A block coprocessor for user data rate improvements to GPRS coding scheme 4

Article

**Accepted Version** 

textual manuscript of published paper

Sherratt, R.S. ORCID: https://orcid.org/0000-0001-7899-4445, Zhang, K. and Wilkes, O.J. (2007) A block coprocessor for user data rate improvements to GPRS coding scheme 4. Journal of Circuits Systems and Computers, 16 (4). pp. 541-551. ISSN 0218-1266 doi: https://doi.org/10.1142/S0218126607003848 Available at https://centaur.reading.ac.uk/15379/

It is advisable to refer to the publisher's version if you intend to cite from the work.

To link to this article DOI: http://dx.doi.org/10.1142/S0218126607003848

Publisher statement: Electronic version of an article published as: Sherratt, R.S., Zhang, K. and Wilkes, O.J. (2007) A block coprocessor for user data rate improvements to GPRS coding scheme 4. Journal of Circuits Systems and Computers, 16 (4). pp. 541-551. ISSN 0218-1266 [DOI: 10.1142/S0218126607003848] © World Scientific Publishing Company [Journal URL http://www.worldscinet.com/jcsc/]

All outputs in CentAUR are protected by Intellectual Property Rights law, including copyright law. Copyright and IPR is retained by the creators or other copyright holders. Terms and conditions for use of this material are defined in the End User Agreement.




# Manuscript for Journal of Circuits, Systems, and Computers

## A Block Coprocessor for User Data Rate Improvements to GPRS Coding Scheme 4

R. Simon Sherratt<sup>1</sup>, Kai Zhang<sup>2</sup>, Owen J. Wilkes<sup>3</sup>

- <sup>1</sup> Signal Processing Laboratory, School of Systems Engineering, The University of Reading, RG6 6AY, UK {r.s.sherratt@reading.ac.uk}.
- <sup>2</sup> K. Zhang was with the Signal Processing Laboratory, School of Systems Engineering, The University of Reading, RG6 6AY, UK when this work was completed and is now with the Beijing Kedong Power Control System Co. Ltd.
- <sup>3</sup> O. J. Wilkes was with the Signal Processing Laboratory, School of Systems Engineering, The University of Reading, RG6 6AY, UK when this work was completed and is now with Aculab.

#### **Contact details**

R. Simon Sherratt
Head of the Department of Electronic Engineering,
The University of Reading,
PO Box 225,
Reading, RG6 6AY
England

Tel: +44 (0) 118 3788588; Fax: +44 (0) 118 3788583

r.s.sherratt@reading.ac.uk

#### **Abstract**

The General Packet Radio Service (GPRS) has been developed to allow packet data to be transported efficiently over an existing circuit switched radio network, such as PCS or GSM. The main applications for GPRS are in transporting Internet Protocol (IP) datagrams from web servers (for telemetry) or the user's mobile Internet browsers to and from the Internet. Four GPRS baseband coding schemes are defined to offer a tradeoff in requested data rates versus propagation channel conditions. However, data rates in excess of 100kbits/second are only achievable if the simplest coding scheme (CS-4) is used, which offers little error detection and correction (requiring excellent SNR), and if the receiver hardware is capable of full duplex operation, which is very expensive and not currently available in the consumer market.

A simple Error Detection and Correction (EDC) scheme to improve the GPRS Block Error Rate (BLER) performance is presented, particularly for coding scheme 4 (CS-4), although gains in the other coding schemes are also seen. For every GPRS radio block that is corrected by the EDC scheme, the block does not need to be retransmitted, releasing bandwidth in the channel and improving the user's application data rate. As GPRS requires intensive processing in the baseband, a viable hardware solution for a GPRS BLER co-processor, currently implemented in a Field Programmable Gate Array (FPGA), is discussed and presented in this paper.

Index Terms — GPRS, IP transport, co-processor, FPGA, application data rate.

#### 1. Introduction

The General Packet Radio Service (GPRS) [1, 2] has been developed to allow Internet Protocol (IP) datagrams to be transported efficiently over a circuit switched radio network backbone. Currently, the main use for GPRS is to enable users to have an 'always on' packet switched mobile Internet terminal interfaced to existing circuit switched radio networks, commonly Personal Communications Services (PCS) [3] and Global System for Mobiles (GSM) [4]. This paper will concentrate on the GSM system as it has the most international subscribers. Other applications for data telemetry over the mobile phone network are extremely common. It is worth noting that high speed solutions exist for data to be transported over a circuit switched connection [5], but these schemes hold the connection open and as such the user is expected to pay for each used slot and the call duration, whereas in GPRS the user pays for the amount of data transported (both schemes often require subscription).

Four GPRS baseband coding schemes are defined (CS-1 to CS-4) to offer a tradeoff in requested data rates versus propagation channel conditions. The marketing for GPRS was heavily biased towards users having available data rates up to 171kbits/second (over GSM). This is only achievable, however, if the simplest coding scheme (CS-4) is used, which in turn offers little error detection and no baseband correction (requiring excellent SNR), and if the receiver hardware is capable of full duplex (simultaneous transmit and receive operation), which in turn is very expensive and not currently available in the consumer market.

The addition of the GPRS service into a mobile terminal not only requires extra processing to handle the GPRS protocol stack and higher IP based applications including web browsers, but extra baseband functions are required to handle the different channel coding schemes, Uplink Status Flag (USF) detection and multi-slot operation (use of multiple slots in the same TDMA frame) if available.

This paper discusses the development of an Error Detection and Correction (EDC) co-processor that has been inserted in the receiver path between the final baseband processing layer and the Logical Link Control (LLC) protocol layer while still maintaining adherence to current standards. The EDC co-processor is capable, in its current form, of detecting and correcting single bit errors in the GPRS radio block, with the objective of improving the system Block Error Rate (BLER) by correcting blocks that otherwise would have been deleted. An improvement in the BLER alleviates the need for the request and the re-transmission of the block and offers a faster and more reliable GPRS service; hence the consumer is more satisfied with the GPRS enabled device(s).

#### 2. GPRS Baseband and Coding Schemes

As presented above, four GPRS baseband coding schemes are defined, CS-1, CS-2, CS-3 and CS-4, all with different characteristics [2] as depicted in Table I. It can be seen from Table I that in order to achieve the high data throughput, CS-4 has no convolutional decoder, with the only opportunity for error correction being in the channel equalizer. Each data block has either a 40-bit or a 16-bit Block Check Sequence (BCS) code (similar to a conventional CRC) appended to the data prior to baseband coding. This is so that at the receiver the validity of the received and decoded radio block can be checked and an indication of a good/bad data block can be passed to Logical Link Control (LLC). LLC will then acknowledge the correct frame or request retransmission of the block based on the BCS. The result of the baseband coding is to always generate a block of 456 coded bits (defined as a radio block). The encoded radio block is split into 4 by 114 bits and carried using a GPRS traffic channel using 4 GSM normal bursts with 8 'stealing bits' for coding scheme indication [6], 2 bits per normal burst. The mapping of a single radio block to 4 TDMA slots is indicated in Fig. 1. For multi-slot operation, different slots in the same TDMA frame are used but for different radio frames.
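The block sizes quoted above can be cross-checked against Table I with a few lines of arithmetic. The sketch below is illustrative only: the dictionary keys and the helper name `bits_before_puncturing` are assumptions introduced here, and the only inputs are the figures tabulated in Table I.

```python
# Field sizes in bits, copied from Table I. The key names are illustrative.
SCHEMES = {
    "CS-1": dict(data=181, usf=3, bcs=40, tail=4, rate_half=True),
    "CS-2": dict(data=268, usf=6, bcs=16, tail=4, rate_half=True),
    "CS-3": dict(data=312, usf=6, bcs=16, tail=4, rate_half=True),
    "CS-4": dict(data=428, usf=12, bcs=16, tail=0, rate_half=False),
}

RADIO_BLOCK_BITS = 456  # every scheme ends up as one 456-bit radio block


def bits_before_puncturing(scheme):
    """Bits leaving the (optional) rate-1/2 convolutional coder."""
    uncoded = scheme["data"] + scheme["usf"] + scheme["bcs"] + scheme["tail"]
    return 2 * uncoded if scheme["rate_half"] else uncoded


sizes = {name: bits_before_puncturing(s) for name, s in SCHEMES.items()}
punctured = {name: n - RADIO_BLOCK_BITS for name, n in sizes.items()}

for name in SCHEMES:
    print(name, sizes[name], "coded bits,", punctured[name], "punctured")
```

Under these assumptions CS-1 and CS-4 arrive at 456 bits without puncturing, while CS-2 and CS-3 produce 588 and 676 bits respectively, matching the puncturing row of Table I.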

If there is an error in the radio block, indicated by the BCS failing, then the whole radio block is normally discarded and a request to retransmit the block is made. Therefore, a reliable data transport is possible, albeit a rather lossy one. To this end, GPRS standards [7] define the acceptable Block Error Rate (BLER) (being  $\leq 10\%$ ), rather than Bit Error Rate (BER), over defined propagation channels at a certain noise level which varies for each coding scheme.

#### 3. Error Correction Scheme Compatible with Existing Standards

GPRS baseband channel decoding consumes a large proportion of the available processing in the Digital Signal Processor (DSP) as protocol decoding and applications software tends to reside in a microprocessor/microcontroller and the DSP tends to be used for channel decoding and time critical actions (a media DSP may also be present for video and enhanced audio coding). To alleviate the bandwidth issues in the baseband DSP, a co-processor can be defined for the receiver, whose function is to search through possible radio block decoded sequences to find a data-set which passes the BCS. If a block sequence is found that initially fails the BCS and due to the operation of the co-processor a corrected sequence is found that does then pass the BCS, an improvement in BLER is subsequently found. Thus the GPRS enabled device can either be quoted as having an improved reference sensitivity level, or a less expensive RF/DSP system can be implemented for the same level of performance, or that the user applications data rate will have improved. It must be noted that the search process must not cause the baseband processor to wait for a result as GPRS often requires a fast turn around between decoding and transmission.

#### 3.1. Definition of BCS Encoding and Decoding

In GPRS, the BCS is defined such that at the encoder the intended data sequence, d, is appended with a 40 (for CS-1) or 16 (for CS-2, CS-3 and CS-4) bit sequence, p, with the result that at the receiver the received data sequence may be divided by a known divisor, g, resulting in a remainder, r, of all 1's in the division process irrespective of the quotient, q. The failure to decode a remainder of all 1's indicates that at least 1 bit in error is present in the received data sequence.

In practice, the encoding sequence needs to find p such that:-

$$\frac{(d << (40 \text{ or } 16)) + p}{g} = q \text{ remainder } r \tag{1}$$

with known polynomial constants [2] for CS-1

$$g(D) = (D^{23} + 1) * (D^{17} + D^3 + 1) \tag{2}$$

$$r(D) = D^{39} + D^{38} + \dots + D^2 + D + 1 \tag{3}$$

or known polynomial constants [2] for CS-2, CS-3 and CS-4

$$g(D) = D^{16} + D^{12} + D^5 + 1 \tag{4}$$

$$r(D) = D^{15} + D^{14} + \dots + D^2 + D + 1 \tag{5}$$

Calculations for the 40-bit or 16-bit scheme are similar and to reduce repetition in this text the 16-bit scheme will be discussed further. To calculate p, let

$$\frac{(d << 16)}{g} = q \text{ remainder } r1 \tag{6}$$

By definition it follows

$$d << 16 = q * g + r1 \tag{7}$$

Obviously

$$(d << 16) + p = q * g + r1 + p \tag{8}$$

Since

$$r1 + p = r, \text{ then } p = r - r1 \tag{9}$$

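Therefore in the encoder, to calculate p, perform the division of (d<<16) by g and assign p to be the logical inverse of the division remainder r1. This follows because subtraction over GF(2) is the exclusive-OR and r is all 1's, so r - r1 is the bitwise inverse of r1. The division can be implemented in the classical CRC shift register format, ensuring a simple decoding process.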

In the BCS decoder, the received data only needs to be divided by g (using the shift registers) and the remainder, r, checked for all 1's.
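The encoder and decoder steps of equations (4) to (9) can be sketched in software as follows. This is a minimal illustration under stated assumptions rather than the implementation used in this work: bits are held as Python lists with the highest power of D first (the bit ordering on air is defined by [2] and is not modelled here), and the function names `remainder`, `bcs_encode` and `bcs_check` are introduced purely for the example.

```python
G16 = 0x11021                   # g(D) = D^16 + D^12 + D^5 + 1, equation (4)
BCS_LEN = 16
ALL_ONES = (1 << BCS_LEN) - 1   # r(D), equation (5)


def remainder(bits):
    """Remainder of the bit sequence divided by g over GF(2).

    Models a bit-serial CRC shift register: shift one bit in, and subtract
    (XOR) g whenever the D^16 term appears.
    """
    reg = 0
    for bit in bits:
        reg = (reg << 1) | bit
        if reg >> BCS_LEN:
            reg ^= G16
    return reg


def bcs_encode(d):
    """Append p, the logical inverse of r1, as in equation (9)."""
    r1 = remainder(d + [0] * BCS_LEN)          # (d << 16) / g, equation (6)
    p = r1 ^ ALL_ONES                          # p = r - r1 over GF(2)
    p_bits = [(p >> i) & 1 for i in reversed(range(BCS_LEN))]
    return d + p_bits


def bcs_check(block):
    """Decoder: divide the received block by g and test for all 1's."""
    return remainder(block) == ALL_ONES


# Illustrative 431-bit CS-4 payload (arbitrary test pattern, not real data).
data = [(i * i + 1) % 3 % 2 for i in range(431)]
block = bcs_encode(data)

corrupted = block.copy()
corrupted[100] ^= 1             # a single bit error anywhere fails the BCS

print(bcs_check(block), bcs_check(corrupted))
```

Because g has a non-zero constant term, no single-bit error pattern is divisible by g, which is why the corrupted block above is always detected.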

#### 3.2. Definition of BCS EDC Algorithm

In GPRS the turn-around time between reception and transmission is very short as the received frame can indicate if the device has been allocated an uplink slot to use and this must not be missed. Therefore, any BCS Error Detection and Correction (EDC) algorithm must be simple and capable of fast processing.

The proposed EDC is very simple and gains its speed from parallel processing. As the arithmetic is over the Galois Field GF(2), the only alternative value for a bit is its inverse, so the algorithm searches for a sequence that passes the BCS by inverting a single bit in the decoded frame. The  $n^{th}$  bit in the decoded radio frame (n=0..(224+40-1) for CS-1, n=0..(271+16-1) for CS-2, n=0..(315+16-1) for CS-3, n=0..(431+16-1) for CS-4) is inverted and the radio frame is passed through the BCS decoding defined above. After each trial, the CRC remainder status is checked for a pass. By having multiple banks of CRC units, all the single-bit combinations can be checked at the same time, resulting in no loss of turn-around time; the result is even partially calculated with the reception of each bit. The proposed EDC algorithm may be extended to correct more than 1 bit, but the complexity of the EDC grows exponentially, in the order of  $2^N$ .
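A sequential software model of the single-bit search is sketched below for the 16-bit BCS. It is only an illustration of the search described above: the hardware evaluates all trial inversions in parallel, whereas this loop tries them one after another, and the helper names (`remainder`, `make_block`, `correct_single_bit`) and the bit ordering are assumptions made for the example.

```python
G16 = 0x11021                   # g(D) = D^16 + D^12 + D^5 + 1
BCS_LEN = 16
ALL_ONES = (1 << BCS_LEN) - 1


def remainder(bits):
    """Bit-serial division by g over GF(2); returns the 16-bit remainder."""
    reg = 0
    for bit in bits:
        reg = (reg << 1) | bit
        if reg >> BCS_LEN:
            reg ^= G16
    return reg


def bcs_check(block):
    return remainder(block) == ALL_ONES


def make_block(data):
    """Helper for the demonstration: append the BCS to a data sequence."""
    p = remainder(data + [0] * BCS_LEN) ^ ALL_ONES
    return data + [(p >> i) & 1 for i in reversed(range(BCS_LEN))]


def correct_single_bit(block):
    """Return (corrected_block, position), (block, None) if already valid,
    or (None, None) when no single-bit inversion passes the BCS."""
    if bcs_check(block):
        return block, None
    for n in range(len(block)):
        trial = block.copy()
        trial[n] ^= 1                 # invert the n-th bit
        if bcs_check(trial):
            return trial, n
    return None, None


sent = make_block([(3 * i + 1) % 5 % 2 for i in range(431)])   # CS-4 length

received = sent.copy()
received[300] ^= 1                    # one bit error in the decoded block
fixed, position = correct_single_bit(received)

double = sent.copy()
double[10] ^= 1
double[20] ^= 1                       # two bit errors: outside the scheme's reach
unfixed, no_position = correct_single_bit(double)

print(position, fixed == sent, unfixed)
```

In this model a block with one inverted bit is restored and the error position reported, while a block with two inverted bits is left uncorrected and would still be retransmitted.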

#### 4. Simulation Model

To test the validity of the proposed co-processor, the GPRS system incorporating the proposed EDC scheme over a GSM baseband was used. The GSM baseband part of the simulation model used here has previously been described [8]. For this work, all the GPRS baseband coding and decoding functions have been added to the GSM model from the interface between the LLC layer to the interface with the GSM normal burst (entry point in previous GSM model). Thus the encoding and decoding for the Block Check Sequence (BCS), tail, USF precoding, convolutional coding and interleaving have been added for the 4 coding schemes. Also present in the model is the dynamic estimation of the transmitted coding scheme that selects which coding scheme decoder to use [9]. The result of the model is a full description of the GPRS baseband from the transmission of an LLC frame to the reception of the LLC frame and its decoded status.

#### **5. EDC Simulation Performance Results**

This section presents the results of the GPRS model in terms of tests defined in the standards to allow the model used here to be compared to the general literature. Then, with the inclusion of the EDC scheme, we present for the relevant propagation channels, the number of corrected radio blocks seen from the simulation model.

#### 5.1. System Model Performance under Reference Sensitivity with no EDC

Fig. 2 presents the base performance of the simulation model with the TU50 propagation channel [7]. By comparing to the standards (BLER≤10%) the model is conformant with margins of 2.0dB for CS-1, 2.0dB for CS-2, 0.3dB for CS-3 and 3.9dB for CS-4.

#### 5.2. System Model Performance under Co-channel Interference with no EDC

Fig. 3 presents the base performance of the simulation model with the TU50 propagation channel under co-channel interference [7] and by comparing to the standards (BLER≤10%) the model is conformant with margins of 0.7dB for CS-1, 1.8dB for CS-2, 0.8dB for CS-3 and 3.2dB for CS-4.

#### 5.3. Improved System Performance under Reference Sensitivity with EDC Present

The EDC scheme was introduced and the improvement in the system was measured for all of the relevant propagation channels. The improvement has been measured in terms of the number of corrected frames due to the EDC scheme over the reception of 500 GPRS frames.

Figs. 4-6 present the number of corrected frames due to the EDC scheme for the TU50, the HT100 and the RA250 propagation channels respectively. It can be seen that for all the channels, there is an improvement for CS-1, CS-2 and CS-3, but the improvement is only marginal and is limited to very high noise conditions. This result is not surprising as CS-1, CS-2 and CS-3 all implement the convolutional decoder. However, the number of corrected frames for CS-4 is astounding for all the propagation channels. The number of corrected frames is a function of SNR, but the peak correction lies between 62 and 70 frames over the reception of 500 frames. This directly equates to between 12% and 14% fewer required retransmissions than without the EDC and, as the channel bandwidth is fixed, at least a 12% to 14% improvement in the data rate seen by the user.

#### 5.4. Improved System Performance under Co-channel Interference with EDC Present

Figs. 7-9 present the number of corrected frames due to the EDC scheme for the TU3, the TU50 and the RA250 propagation channels with co-channel interference respectively. As above, the correction performance for CS-1, CS-2 and CS-3 exists but is extremely small compared to CS-4. The number of CS-4 corrections is dependent on C/Ic and has peak values of between 24 and 63 frames over the reception of 500 frames. This again directly equates to an improvement of at least 5% to 12% in the data rate seen by the user.
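The percentages quoted in Sections 5.3 and 5.4 follow directly from the peak CS-4 correction counts over 500 received frames. The short calculation below simply reproduces that arithmetic; the variable names are illustrative and no data beyond the quoted peak values is assumed.

```python
FRAMES_RECEIVED = 500

# Peak numbers of corrected CS-4 frames quoted in Sections 5.3 and 5.4.
peak_corrections = {
    "reference sensitivity": (62, 70),
    "co-channel interference": (24, 63),
}

# Each corrected frame is one retransmission avoided, so the saving is the
# corrected count expressed as a percentage of the frames received.
saving_percent = {
    test: tuple(round(100 * count / FRAMES_RECEIVED, 1) for count in counts)
    for test, counts in peak_corrections.items()
}

for test, (low, high) in saving_percent.items():
    print(f"{test}: {low}% to {high}%")
```

The results, 12.4% to 14.0% and 4.8% to 12.6%, are the values rounded in the text to 12% to 14% and 5% to 12%.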

#### **6. Co-processor Implementation and Results**

Once the validity of the EDC scheme had been verified by simulation, the EDC scheme was implemented. The implementation in direct software was considered to be too processor intensive considering all the processing the baseband DSP was already computing ready to examine the USF so as not to miss its own uplink slot. It was decided to implement the EDC in a hardware solution as a co-processor to the baseband DSP processor. The actual EDC coprocessor was implemented using a Field Programmable Gate Array (FPGA) solution with a compatible parallel bus interface to the existing baseband DSP. The crude architecture of the coprocessor is presented in Fig. 10. The baseband DSP resets the co-processor and then informs the co-processor of the coding scheme as this has an impact on the length of the data and the BCS. The decoded data is then clocked into the co-processor, bit by bit as the bits are received, and the CRC units compute the current state. In this way there is little delay in the computation compared to buffering all the data in the co-processor before execution. The control unit manages the required bit inversion and passes the data to the parallel bank of individual CRC computation registers. After all the data has been clocked, the result from each CRC is directly available and the output unit examines the CRC's looking for a pass. The interface unit is informed of the BCS state and the result is available for the baseband processor to read.
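The bit-serial operation of the CRC bank can be modelled in software as a behavioural sketch. The class below is not derived from the FPGA design files: the names `CrcBank`, `clock` and `passing_positions`, the 16-bit BCS, and the bit ordering are all assumptions chosen for illustration. Each unit sees the same incoming bit stream but inverts the one bit position assigned to it, so every register holds its final remainder as soon as the last bit has been clocked in.

```python
G16 = 0x11021                   # g(D) = D^16 + D^12 + D^5 + 1
BCS_LEN = 16
ALL_ONES = (1 << BCS_LEN) - 1


def crc_step(reg, bit):
    """One clock of a bit-serial CRC shift register."""
    reg = (reg << 1) | bit
    return reg ^ G16 if reg >> BCS_LEN else reg


class CrcBank:
    """Bank of CRC units; unit u inverts bit number first_bit + u."""

    def __init__(self, first_bit, n_units=64):
        self.first_bit = first_bit
        self.regs = [0] * n_units
        self.clocked = 0

    def clock(self, bit):
        """Present one received bit to every unit in the bank."""
        for u in range(len(self.regs)):
            invert = 1 if self.clocked == self.first_bit + u else 0
            self.regs[u] = crc_step(self.regs[u], bit ^ invert)
        self.clocked += 1

    def passing_positions(self):
        """Bit positions whose inversion gives a remainder of all 1's."""
        return [self.first_bit + u
                for u, reg in enumerate(self.regs)
                if reg == ALL_ONES and self.first_bit + u < self.clocked]


def make_block(data):
    """Helper for the demonstration: append the BCS to a data sequence."""
    reg = 0
    for bit in data + [0] * BCS_LEN:
        reg = crc_step(reg, bit)
    p = reg ^ ALL_ONES
    return data + [(p >> i) & 1 for i in reversed(range(BCS_LEN))]


received = make_block([(i // 3) % 2 for i in range(431)])
received[200] ^= 1              # single bit error in the decoded block

hit = CrcBank(first_bit=192)    # covers bit positions 192..255
miss = CrcBank(first_bit=0)     # covers bit positions 0..63
for bit in received:
    hit.clock(bit)
    miss.clock(bit)

print(hit.passing_positions(), miss.passing_positions())
```

Only the bank whose range contains the errored bit reports a pass, which is the information the output unit would hand back to the baseband processor in this model.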

Initially, a large FPGA was considered but this was far too expensive. An inexpensive 300k gate FPGA was used [10] where only 64 CRC units and the associated control logic could be synthesized, as the full floor plan of Fig. 11 depicts. To remove the need to use a larger FPGA, minor control logic was added to split the decoding into a maximum of 7 sections, each computing a set of 64 CRCs. Thus in the final implementation the co-processor is called 7 times, with a delay that is acceptable in GPRS; however, designers may choose a tradeoff between the available number of gates and the acceptable latency. The compromise in the loss of speed (to an acceptable level) and slightly increased complexity in programming versus the reduced cost and size/power was seen as important for a consumer solution.
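The figure of 7 sections follows from dividing the number of candidate bit positions by the 64 available CRC units. The calculation below is a sketch of that trade-off using the search ranges stated in Section 3.2; the names `CRC_UNITS`, `SEARCH_BITS` and `passes` are illustrative.

```python
import math

CRC_UNITS = 64   # CRC units that fit in the 300k gate FPGA

# Number of candidate bit positions per coding scheme, taken from the
# ranges of n given in Section 3.2.
SEARCH_BITS = {
    "CS-1": 224 + 40,
    "CS-2": 271 + 16,
    "CS-3": 315 + 16,
    "CS-4": 431 + 16,
}

# Each pass tests CRC_UNITS bit positions, so the pass count is a ceiling.
passes = {cs: math.ceil(bits / CRC_UNITS) for cs, bits in SEARCH_BITS.items()}

for cs, count in passes.items():
    print(f"{cs}: {SEARCH_BITS[cs]} positions -> {count} passes")
```

With these figures CS-4 is the worst case at 7 passes (447 positions); doubling the number of CRC units would roughly halve the pass count at the cost of a larger device.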

Further simplifications can be made if the EDC co-processor was only intended for CS-4, particularly as CS-1 requires banks of 40-bit CRC registers while CS-4 only requires 16-bit CRC registers.

#### 7. Conclusion

In an attempt to improve the confidence that users have in GPRS based consumer electronic devices, a solution to extend the capability of the Block Check Sequence (BCS) to incorporate Error Detection and Correction (EDC) is proposed that fits within the current standards. A simulation was constructed to test the performance of the proposed solution under standard propagation conditions and the improvement in the Block Error Rate (BLER) of the system with the co-processor incorporated was found. The improvement was particularly profound for CS-4 and this is encouraging as CS-4 offers the fastest transport that the users wish for. The result of these simulations also indicates that the proposed EDC is able to improve the download performance of the GPRS service to the consumer by between 5% and 14%. To alleviate the baseband decoder processor bandwidth issues, the EDC co-processor is presented as currently implemented in an FPGA and the architecture of the co-processor is discussed. This work has demonstrated that the co-processor using this architecture can address the correction of a single bit error in the decoded GPRS radio block. By implementing this co-processor in the GPRS service, an improvement to the overall received data throughput can be achieved in practice to benefit the consumers.

#### **Acknowledgment**

The contribution of DSP development hardware and host software is gratefully acknowledged from Texas Instruments on behalf of Robert Owen, University Programme Manager: Education & Communications of the Texas Instruments European University Programme.

#### **References**

- 1. G. Bianchi, F. Borgonovo, A. Capone and L. Musumeci, "Packet data service over GSM networks: proposal and performance evaluation attempt", Proc. IEEE 5th Int. Symp. Personal, Indoor and Mobile Radio Communications, Vol. 3, pp. 929--933, 1994.
- 2. ETSI, GSM 05.03 version 8.5.0 Release 1999, European Standard, EN 300 909 V8.5.0 (2000-07).
- 3. N. J. Boucher, The Cellular Radio Handbook: A Reference for Cellular System Operation, Wiley 2001, ISBN 0-471-38725-8.
- 4. V. K. Garg and J. E. Wilkes, Principles and Applications of GSM, Prentice Hall 1999, ISBN 0139491244.
- 5. D. Zhou and M. Zukerman, "Performance and efficiency evaluation of channel allocation schemes for HSCSD in GSM", Proc. IEEE Global Telecomm. Conf., Vol. 1B, pp. 1084--1088, 1999.
- 6. ETSI, GSM 05.02 version 8.5.0 Release 1999, European Standard, EN 300 908 V8.5.0 (2000-07).
- 7. ETSI, GSM 05.05 version 8.5.0 Release 1999, European Standard, EN 300 910 V8.5.0 (2000-07).
- 8. R. S. Sherratt, "Performance of GPRS Coding Scheme Detection under Severe Multipath and Co-Channel Interference as a Function of Soft-bit Width", Proc. IEEE Wireless Comm. and Networking Conf., Vol. 2, 16-20 March 2003, pp. 801--805.
- 9. R. S. Sherratt and X. Yao, "Results of GPRS BLER as a Function of Coding Scheme Detection Performance and Signal Dynamic Range", Proc. IEEE Int. Symp. on Consumer Electronics, Sydney Australia, 3-5 December 2003, IEEE CD-ROM.
- 10. Xilinx, Spartan-IIE 1.8v FPGA family, complete data sheet, Document DS077-4, 14<sup>th</sup> Feb 2003, www.xilinx.com

#### Table captions

Table I, Tabulated coding scheme structure [2]

#### Figure captions

- Fig. 1. Mapping of coded radio blocks onto the slotted TDMA structure.
- Fig. 2. Base performance of used GPRS model for each of the 4 coding schemes under the TU50 reference sensitivity channel.
- Fig. 3. Base performance of used GPRS model for each of the 4 coding schemes under the TU50 co-channel interference channel.
- Fig. 4. Number of corrected GPRS frames for each of the 4 coding schemes under the TU50 reference sensitivity channel.
- Fig. 5. Number of corrected GPRS frames for each of the 4 coding schemes under the RA250 reference sensitivity channel.
- Fig. 6. Number of corrected GPRS frames for each of the 4 coding schemes under the HT100 reference sensitivity channel.
- Fig. 7. Number of corrected GPRS frames for each of the 4 coding schemes under the TU3 propagation path with co-channel interference.
- Fig. 8. Number of corrected GPRS frames for each of the 4 coding schemes under the TU50 propagation path with co-channel interference.
- Fig. 9. Number of corrected GPRS frames for each of the 4 coding schemes under the RA250 propagation path with co-channel interference.
- Fig. 10. Architecture of the EDC co-processor. The co-processor has a classical microprocessor bus interface.
- Fig. 11. Floor plan of the 300k gate FPGA with the GPRS single bit EDC co-processor implemented (the square in the lower right hand corner is the control hardware while everything else forms the CRC banks).

|                                                               | CS-1 | CS-2       | CS-3       | CS-4 |
|---------------------------------------------------------------|------|------------|------------|------|
| Useful data portion of input data block, excluding USF (bits) | 181  | 268        | 312        | 428  |
| Block Check Sequence length (bits)                            | 40   | 16         | 16         | 16   |
| USF length (bits)                                             | 3    | 6          | 6          | 12   |
| Added tail bits                                               | 4    | 4          | 4          | 0    |
| Convolutional coding                                          | 1/2  | 1/2        | 1/2        | none |
| Puncturing                                                    | none | 588 to 456 | 676 to 456 | none |
| Overall coded radio block size (bits)                         | 456  | 456        | 456        | 456  |

Table I, Tabulated coding scheme structure [2]




{End of Manuscript}