The code below reads a data source (following all the reading rules) whose text consists only of characters with one-byte UTF-8 encodings:
package main
import (
"fmt"
"io"
)
type MyStringData struct {
str string
readIndex int
}
func (myStringData *MyStringData) Read(p []byte) (n int, err error) {
// convert `str` string to slice of bytes
strBytes := []byte(myStringData.str)
// if `readIndex` is GTE source length, return `EOF` error
if myStringData.readIndex >= len(strBytes) {
return 0, io.EOF // `0` bytes read
}
// get next readable limit (exclusive)
nextReadLimit := myStringData.readIndex + len(p)
if nextReadLimit >= len(strBytes) {
nextReadLimit = len(strBytes)
err = io.EOF
}
// get next bytes to copy and set `n` to its length
nextBytes := strBytes[myStringData.readIndex:nextReadLimit]
n = len(nextBytes)
// copy all bytes of `nextBytes` into `p` slice
copy(p, nextBytes)
// increment `readIndex` to `nextReadLimit`
myStringData.readIndex = nextReadLimit
// return values
return
}
func main() {
// create data source
src := MyStringData{str: "Hello Amazing World!"} // 学中文
p := make([]byte, 3) // slice of length `3`
// read `src` until an error is returned
for {
// read up to `len(p)` bytes from `src` into `p`
n, err := src.Read(p)
fmt.Printf("%d bytes read, data:%s\n", n, p[:n])
// handle error
if err == io.EOF {
fmt.Println("--end-of-file--")
break
} else if err != nil {
fmt.Println("Oops! some error occurred!", err)
break
}
}
}
Output:
$ go run src/../Main.go
3 bytes read, data:Hel
3 bytes read, data:lo
3 bytes read, data:Ama
3 bytes read, data:zin
3 bytes read, data:g W
3 bytes read, data:orl
2 bytes read, data:d!
--end-of-file--
But the above code cannot correctly read a data source whose text contains characters with UTF-8 encodings longer than one byte, as shown below:
src := MyStringData{str: "Hello Amazing World!学中文"}
Below is the output:
$ go run src/../Main.go
3 bytes read, data:Hel
3 bytes read, data:lo
3 bytes read, data:Ama
3 bytes read, data:zin
3 bytes read, data:g W
3 bytes read, data:orl
3 bytes read, data:d!?
3 bytes read, data:???
3 bytes read, data:???
2 bytes read, data:??
--end-of-file--
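To make the `?` output easier to reason about, here is a minimal sketch that cuts the same string into 3-byte windows and checks whether each window is valid UTF-8 on its own. The helper name `chunkReport` is hypothetical and only used for illustration; `utf8.Valid` and `utf8.RuneCountInString` come from the standard `unicode/utf8` package.

```go
package main

import (
	"fmt"
	"unicode/utf8"
)

// chunkReport (hypothetical helper) splits s into byte windows of the
// given size and reports whether each window is valid UTF-8 by itself.
func chunkReport(s string, size int) []bool {
	b := []byte(s)
	var valid []bool
	for i := 0; i < len(b); i += size {
		end := i + size
		if end > len(b) {
			end = len(b)
		}
		valid = append(valid, utf8.Valid(b[i:end]))
	}
	return valid
}

func main() {
	s := "Hello Amazing World!学中文"
	// byte count vs. rune count: prints "29 23"
	fmt.Println(len(s), utf8.RuneCountInString(s))
	// the first six windows are valid, the last four are not
	fmt.Println(chunkReport(s, 3))
}
```

The ASCII part is 20 bytes long, so the seventh window holds `d!` plus only the first byte of 学, and every later window starts in the middle of a character.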
Edit:
Following the comments about using strings.NewReader(), the code is modified as below:
// create data source
src := strings.NewReader("Hello Amazing World!学中文") // 学中文
// p := make([]byte, 3) // slice of length `3`
// read `src` until an error is returned
for {
// read the next rune from `src`
ch, n, err := src.ReadRune()
// n, err := src.Read(p)
fmt.Printf("%d bytes read, data:%c\n", n, ch)
// handle error
if err == io.EOF {
fmt.Println("--end-of-file--")
break
} else if err != nil {
fmt.Println("Oops! some error occurred!", err)
break
}
}
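The `ReadRune` loop works because `strings.Reader` provides that method. For a custom `io.Reader`, one possible option (my assumption, not something stated in the comments) is to wrap it in `bufio.NewReader`, whose `ReadRune` reassembles characters that the underlying `Read` splits. In this sketch `chunkedSource` and `readRunes` are hypothetical names; `chunkedSource` stands in for a reader that hands out only a few bytes per call.

```go
package main

import (
	"bufio"
	"fmt"
	"io"
	"unicode/utf8"
)

// chunkedSource (hypothetical) returns at most `size` bytes per Read,
// so multi-byte characters get split across calls. size must be > 0.
type chunkedSource struct {
	data []byte
	size int
}

func (c *chunkedSource) Read(p []byte) (int, error) {
	if len(c.data) == 0 {
		return 0, io.EOF
	}
	n := c.size
	if n > len(p) {
		n = len(p)
	}
	if n > len(c.data) {
		n = len(c.data)
	}
	copy(p, c.data[:n])
	c.data = c.data[n:]
	return n, nil
}

// readRunes drains r rune by rune through a bufio.Reader.
func readRunes(r io.Reader) ([]rune, error) {
	br := bufio.NewReader(r)
	var out []rune
	for {
		ch, _, err := br.ReadRune()
		if err == io.EOF {
			return out, nil
		}
		if err != nil {
			return out, err
		}
		out = append(out, ch)
	}
}

func main() {
	src := &chunkedSource{data: []byte("Hello Amazing World!学中文"), size: 3}
	runes, err := readRunes(src)
	if err != nil {
		fmt.Println("Oops! some error occurred!", err)
		return
	}
	for _, ch := range runes {
		fmt.Printf("%d bytes read, data:%c\n", utf8.RuneLen(ch), ch)
	}
}
```

Checking `err` before using `ch` also avoids printing a zero-byte line when `ReadRune` reports `io.EOF`.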
How can Unicode characters be read without splitting a character (say 学) across two Read calls?
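For reference, this is a rough sketch of the behaviour I am after, written as a variant of `Read` that only copies whole characters into `p`. The names `RuneAlignedData` and `readChunks` are hypothetical, and returning `io.ErrShortBuffer` when `p` cannot hold even one character is my own design assumption, not an established rule.

```go
package main

import (
	"fmt"
	"io"
	"unicode/utf8"
)

// RuneAlignedData (hypothetical) is like MyStringData, but its Read
// never stops in the middle of a UTF-8 sequence.
type RuneAlignedData struct {
	str       string
	readIndex int
}

func (d *RuneAlignedData) Read(p []byte) (n int, err error) {
	if d.readIndex >= len(d.str) {
		return 0, io.EOF
	}
	if len(p) == 0 {
		return 0, nil
	}
	rest := d.str[d.readIndex:]
	// add whole runes for as long as the next one still fits into p
	for n < len(rest) {
		_, size := utf8.DecodeRuneInString(rest[n:])
		if n+size > len(p) {
			break
		}
		n += size
	}
	if n == 0 {
		// assumption: p is too small for the next character
		return 0, io.ErrShortBuffer
	}
	copy(p, rest[:n])
	d.readIndex += n
	return n, nil
}

// readChunks (hypothetical) collects what each Read call returned.
func readChunks(r io.Reader, size int) ([]string, error) {
	p := make([]byte, size)
	var chunks []string
	for {
		n, err := r.Read(p)
		if n > 0 {
			chunks = append(chunks, string(p[:n]))
		}
		if err == io.EOF {
			return chunks, nil
		}
		if err != nil {
			return chunks, err
		}
	}
}

func main() {
	src := &RuneAlignedData{str: "Hello Amazing World!学中文"}
	chunks, err := readChunks(src, 3)
	if err != nil {
		fmt.Println("Oops! some error occurred!", err)
		return
	}
	for _, c := range chunks {
		fmt.Printf("%d bytes read, data:%s\n", len(c), c)
	}
}
```

With a 3-byte `p`, the seventh call returns only `d!` (2 bytes) because 学 no longer fits, and the following calls return 学, 中 and 文 whole. The cost is that `p` must be at least `utf8.UTFMax` (4) bytes long to be safe for arbitrary text.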