A Novel Cryptosystem Based on Steganography and Automata Technique for Searchable Encryption

Truong, Nguyen Huy;

doi:10.3837/tiis.2020.05.022

KSII Transactions on Internet and Information Systems (TIIS)

Volume 14, Issue 5 / Pages 2258-2274 / 2020 / 1976-7277 (pISSN) / 1976-7277 (eISSN)

Korean Society for Internet Information (한국인터넷정보학회)

Truong, Nguyen Huy (School of Applied Mathematics and Informatics, Hanoi University of Science and Technology)

Received : 2019.07.05
Accepted : 2020.04.01
Published : 2020.05.31

https://doi.org/10.3837/tiis.2020.05.022

Abstract

In this paper we first propose a new cryptosystem with high security, based on our data hiding scheme (2,9,8) introduced in 2019, in which encrypting and hiding are done at once and, unlike existing hybrid techniques of cryptography and steganography, the size of the ciphertext does not depend on the input image size. We then exploit our automata approach presented in 2019 to design two algorithms for exact and approximate pattern matching on secret data encrypted by our cryptosystem. Theoretical analyses show that both algorithms have O(n) time complexity in the worst case, where the approximate algorithm is assumed to use ⌈(1-ε)m⌉ processors, and ε, m and n are the error of our string similarity measure and the lengths of the pattern and the secret data, respectively. In searchable encryption, our cryptosystem is used by users and our pattern matching algorithms are performed by cloud providers.

Keywords

1. Introduction

1.1 Background

Nowadays, with the rapid development of applications built on Internet infrastructure, cloud computing has become one of the most active topics in information technology. It is an Internet-based computing system that provides on-demand services ranging from application and system software to storage and data processing. For example, cloud users of a storage service can upload information to the servers and then access it online. Meanwhile, enterprises no longer need to spend large sums on owning and maintaining their own hardware and software. Although cloud computing brings many benefits to individuals and organizations, cloud security is still an open problem, because cloud providers can abuse users’ information and cloud users lose control of it. Thus, guaranteeing the privacy of tenants’ information without negating the benefits of cloud computing is necessary [8, 11, 12, 13, 16, 32, 33].

In order to protect cloud users’ privacy, sensitive data need to be encrypted before being outsourced to servers. Unfortunately, encryption makes searching on ciphertext much more difficult for the servers than searching on plaintext. To solve this problem, many searchable encryption (SE) techniques have been presented since 2000. SE not only stores users’ encrypted data securely but also allows information search over ciphertext [7, 8, 9, 11, 12, 16, 19, 22, 32].

In cryptography, SE is a cryptosystem such that search can be done on encrypted data directly. SE can be either searchable symmetric encryption (SSE) or searchable asymmetric encryption (SAE). In SSE, only private key holders can create encrypted data and produce trapdoors for search. In SAE, users who have the public key can make ciphertexts but only private key holders can generate trapdoors [7, 8, 12, 16, 32].

Many SE methods address the problem of searching for pre-chosen keywords in ciphertext. For this problem, suppose that each data item (document) contains a set of keywords. Then there are two approaches to SE. The first is to create an index that maps each document to its keywords (forward index) or each keyword to the documents containing it (inverted index). The second is to perform a sequential search without an index. Recently, to make search more flexible and to avoid wrong or empty results, new methods supporting approximate (fuzzy) keyword search have also been studied, in addition to traditional solutions that only provide exact keyword search [7, 8, 9, 11, 12, 13, 16, 19, 22, 25, 32].

However, keyword-based SE faces a problem. Keywords must be determined in advance and encrypted in some form before all encrypted files are uploaded to the cloud. Searching for new keywords can then return false results, even if the user data contains these keywords, because they were not included in the set of defined keywords. Furthermore, for index-based SE, very large indexes make keyword search inefficient [7, 12, 13, 19].

To deal with the above problem, several SE techniques have been proposed, such as supporting file update functionality [9, 12, 16, 32], keeping the index file small [19], and providing pattern matching in which the search pattern is only given at search time [7, 13, 25].

Pattern matching is applied every day to search for information and analyze data, for example in find and replace in text editing systems, the Google search engine, database queries, and searching on genomic data [7, 26, 28].

Here, our work takes an interest in the problem of pattern matching on encrypted data, which is an important research direction in SE.

In spite of its importance, this problem has not received adequate attention. To the best of our knowledge, there exist only a few SE methods for exact pattern matching, and none for approximate pattern matching. Haynberg et al. [13] introduced SSE for exact pattern matching by using directed acyclic word graphs in the encryption algorithm (for more details about this data structure, see [3]). However, their technique needs partial decryption of the ciphertext, so the plaintext could be leaked to attackers. Further, the search is performed on the user side. Strizhov et al. [25] allowed pattern search on ciphertexts using the position heap tree data structure (see [10] for more details about the position heap). In this method, the server does not search the encrypted data directly but only an index constructed from the secret data. Desmoulins et al. [7] proposed SE for exact pattern matching, where the search phase Test is a pattern matching algorithm whose time complexity is O(mn) in the worst case, where m and n are the lengths of the pattern and the secret data.

The goal of this paper is to propose a novel symmetric cryptosystem that is used on the users’ side, and algorithms for exact and approximate pattern matching on ciphertexts that are used on the cloud servers’ side. These are essential components of SSE.

Cryptography and data hiding are two branches of information security. Cryptography is used to distort data so that it cannot be understood by attackers; it includes symmetric and asymmetric cryptography. Data hiding, in contrast, is used to hide data in digital media such as image, audio and video files. It can be classified into steganography, which protects secret data by hiding their existence, and watermarking, which protects digital media by embedding watermarks in them [6, 34, 27].

Although cryptography and steganography are each capable of protecting secret data separately, different combinations of them are being developed to create systems with better security. The well studied hybrid technique is to encrypt the secret data using cryptography and then embed the ciphertext using steganography [2, 5, 30]. For gray images, Song et al. [23] introduced the first method that encrypts and embeds at once. However, since all these methods must guarantee acceptable imperceptibility of the digital media, the total amount of secret data that can be hidden is limited by the size of the media. Among digital media formats, images are the most popular carriers for steganography because digital images are often transmitted on the Internet and have a high degree of redundancy. Furthermore, image steganography is mainly performed in the spatial domain [4, 14, 31]. So, to address the limitation of the existing hybrid methods, we propose a novel approach to construct a new cryptosystem based on spatial domain image steganography.

In our approach, we use the data hiding scheme (2,9,8), a block based method in the spatial domain, where 9 is the number of pixels in any image block and 8 is the number of secret bits that can be embedded in a block by changing the colors of at most 2 pixels in the block. This scheme is near optimal for gray and palette images, with high efficiency in embedding capacity, speed, security and visual quality, which are the main properties of data hiding schemes [27]. Since our cryptosystem is designed to solve the problem of pattern matching on encrypted data, and the secret data is assumed to be a string over an alphabet of size 256, the cryptosystem encrypts the letters of the secret data one by one.

For a given letter in the alphabet, corresponding to an 8-bit string, we use the embedding function of the data hiding scheme (2, 9, 8) to compute the information (called the flip information) needed to change the input image block. This flip information consists of the positions of pixels in the block and the way the colors of these pixels are changed. The code word of the letter is a binary string representing the flip information. The security analyses show that our cryptosystem provides high security with a key space of about \(2^{20} 3^{9} 9!\, 2^{90}\, 2^{8}!\) for gray images and \(2^{20} 3^{9} 9!\, 2^{18+9t}\, 2^{8}!\) for palette images, where t is the number of bits representing color indexes.

We now return to the remaining main objective of our work, the problem of pattern matching on encrypted data on the cloud providers’ side. In our earlier results [28,29], the automata technique was applied to the exact pattern matching problem and the longest common subsequence problem on plaintexts, with the aim of designing algorithms that are effective in practice. In this paper, we apply those algorithms to construct exact and approximate pattern matching algorithms on ciphertexts performed by servers. Our main idea is to encrypt the automaton corresponding to a given pattern and to consider this encrypted automaton as a part of the trapdoor. Theoretical analyses show that our algorithms all have O(n) time complexity in the worst case, where the approximate algorithm is assumed to use ⌈(1−ε)m⌉ processors, and ε, m and n are the error of the string similarity measure and the lengths of the pattern and the secret data, respectively.

1.2 Contributions

Our contributions in this paper can be summarized as follows.

1. We propose a novel approach to construct a cryptosystem based on steganography. The outstanding advantages of our cryptosystem are that encrypting and hiding are done at once, and that, unlike existing hybrid techniques of cryptography and steganography, the ciphertexts do not depend on the input image size. In particular, this cryptosystem can be applied in SSE to encrypt and decrypt secret data by users.

2. We propose two algorithms, one sequential and one parallel, for exact and fuzzy pattern matching on secret data encrypted by our cryptosystem. These algorithms can be used by servers in SSE to perform pattern search. Their outstanding feature is that they can be applied to any cryptosystem that supports encrypting the letters of the secret data one by one.

1.3 Organization

We organize the rest of this paper as follows. In Section 2, we recall some terminologies, definitions and results from [24, 27, 28, 29]. Section 3 proposes the algorithms used by users and servers in SSE. In Subsection 3.1, we first construct a novel cryptosystem based on the data hiding scheme in [27], apply this cryptosystem to the process of encrypting and decrypting a secret data sequence, and discuss some security analyses. In Subsection 3.2, we then use the automata approach of our previous research [28,29] to design exact and approximate pattern matching algorithms on secret data encrypted by our cryptosystem. Finally, Section 4 provides our conclusions.

2. Preliminaries

In this section, we will attempt to recall terminologies, definitions and results in [24, 27, 28, 29], which are really needed in order to present our new results clearly and logically, as well as help readers follow our paper’s content easily.

Now, we start with our near optimal data hiding scheme (2, 9, 8) proposed in [27], one of our data hiding schemes based on the Galois field GF(2²), constructed from the polynomial ring Z₂[x] [24]. This scheme is a core material for constructing our new cryptosystem.

The data hiding scheme (2, 9, 8) is a five tuple \((\mathrm{I}, \mathrm{M}, \mathrm{K}, E m, E x)\), where \(\mathrm{I}\) is a set of all image blocks of 9 pixels with the same image format, \(\mathrm{M}=G F^{4}\left(2^{2}\right)\), \(\mathrm{K}\) is a finite set of secret keys of 9 elements of \(G F\left(2^{2}\right)\), and Em and Ex are designed as follows [27].

Without loss of generality, we assume that for \(I \in \mathrm{I}\), \( K \in \mathrm{K}\), they can be given by

\(I=\left\{I_{1}, I_{2}, \ldots, I_{9}\right\}\),

where Ii is a color index in the palette for palette images or color value for gray images of the \(i^{t h}\) pixel in I, \(\forall i=\overline{1,9}\).

\(K=\left\{K_{1}, K_{2}, \ldots, K_{9}\right\}\)

with \(K_{i} \in G F\left(2^{2}\right)\), \(\forall i=\overline{1,9}\).

Given a secret element \(M \in \mathrm{M}\), an image block \(I \in \mathrm{I}\) and a key \(K \in \mathrm{K}\), let G be the flip graph for gray and palette images constructed as in [27] and A(I, M, K) be the automaton defined as in [27]. Here q₀ is the initial state and δ is the state transition function of A(I, M, K), \(\text { Adjacent }\left(I_{i_{t}}, a_{t}\right)\) is an adjacent vertex of the vertex \(I_{i_{t}}\) in G, and a_t is the weight of the arc \(\left(I_{i_{t}}, \text { Adjacent }\left(I_{i_{t}}, a_{t}\right)\right)\). Then we have [27]

The embedding function Em embedding M in I:

\(q=q_{0};\) (2.1)

\(\text { For } i=1 \text { to } 9 \text { Do } q=\delta\left(q, I_{i}\right);\) (2.2)

\(q=\delta(q, M);\) (2.3)

\(\text { For each }\left(i_{t}, a_{t}\right) \text { in } q \text { Do } I_{i_{t}}=\operatorname{Adjacent}\left(I_{i_{t}}, a_{t}\right);\) (2.4)

\(I^{\prime}=I;\) (2.5)

The extracting function Ex extracting M from I′ :

\(q=q_{0};\) (2.6)

\(\text { For } i=1 \text { to } 9 \text { Do } q=\delta\left(q, I_{i}^{\prime}\right);\) (2.7)

\(M=q ;\) (2.9)

Remember that the data hiding scheme (2,9,8) means we can hide a secret string of length 8 bits in an image block of 9 pixels with at most 2 pixels modified.

According to our construction of the data hiding scheme (2,9,8), assume that we publish the parameters Em, Ex, the vector space \(G F^{4}\left(2^{2}\right)\) and the flip graph G of this scheme. Then the security of the data hiding scheme (2,9,8) is given by the following formula [27]

\(c 3^{9} 9 ! 2^{18} 2^{8} !, \text { where } c \approx 2^{20}\). (2.10)

We then recall the components and properties of a cryptosystem in [24].

Definition 2.1 ([24]). A five tuple \((\mathcal{P}, \mathcal{C}, \mathcal{K}, \mathcal{E}, \mathcal{D})\) is called a cryptosystem if the following properties hold.

1. \(\mathcal{P}\) is a finite set of plaintexts.

2. \(\mathcal{C}\) is a finite set of ciphertexts.

3. \(\mathcal{K}\) is a finite set of secret keys.

4. For all \(k \in \mathcal{K}\), there exist an encrypting function \(e_{k} \in \mathcal{E}\) and a corresponding decrypting function \(d_{k} \in \mathcal{D}\), where \(e_{k}: \mathcal{P} \rightarrow \mathcal{C}\) and \(d_{k}: \mathcal{C} \rightarrow \mathcal{P}\) satisfy \(d_{k}\left(e_{k}(x)\right)=x\) for all \(x \in \mathcal{P}\).

Next, we present main terminologies and facts in [28] to design the exact pattern matching on encrypted data.

We call a finite set \(\Sigma\) an alphabet and denote the size of \(\Sigma\) by \(|\Sigma|\). Any element in \(\Sigma\) is called a letter. A string on \(\Sigma\) is a finite sequence of letters of \(\Sigma\). Denote the set of all strings over \(\Sigma\) by \(\Sigma^{*}\). The empty string is denoted by \(\varepsilon\). The length of the string x, denoted by ∣x∣, is the number of letters of x. Any string x of length n can be represented by

\(x=x[1] x[2] \ldots x[n], x[i] \in \Sigma, 1 \leq i \leq n\),

where n is a positive integer.

Denote the concatenation operator of the two strings u₁ and u₂ by u₁u₂.

A string p is called a substring of the string x if x = u₁pu₂ for some strings u₁ and u₂. In case \(u_{1}=\varepsilon\) (resp. \(u_{2}=\varepsilon\)), the string p is called a prefix (resp. suffix) of x. If p ≠ x, the prefix (resp. suffix) p is called proper.

We denote the i^th element of x by x[i] and i is called a position in x, the substring x[i]x[i+1]..x[j] of x by x[i..j] for ∀1 ≤ i ≤ j ≤ n. Let p be a substring of length m of x, where m is a positive integer, then there exists i for 1 ≤ i ≤ n − m + 1 such that p = x[i..i + m − 1]. We say that i is an occurrence of p in x or p occurs in x at position i.

Definition 2.2 ([28]). Given a string p and a letter a of p. Let i be some position in p for 1 ≤ i ≤ |p|. Then call i the last position of appearance of a in p, denoted by Pos_p(a) if a = p[i] and ∀j > i, j ≤ |p|, a ≠ p[j].

Definition 2.3 ([28]). Let p be a pattern of length m over the alphabet Σ. Then Next_p of p is a function such that Next_p : {1,...,m} → {0,...,m − 1} defined by Next_p(l) = max{|s| |s is both a suffix and proper prefix of p[1..l]} for l ∈ {1,...,m}.
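As an illustration of Definition 2.3, the following minimal sketch evaluates Next_p by brute force, directly from the definition. The function name `next_p` and the use of Python strings are assumptions made here for illustration; the sketch is not the construction used in [28] and makes no attempt at efficiency.

```python
def next_p(p: str) -> list[int]:
    """Return [Next_p(1), ..., Next_p(m)] evaluated literally from Definition 2.3."""
    m = len(p)
    table = []
    for l in range(1, m + 1):
        head = p[:l]                      # p[1..l] in the paper's 1-based notation
        best = 0                          # the empty string always qualifies
        for k in range(1, l):             # a proper prefix has length < l
            if head[:k] == head[l - k:]:  # prefix of length k equals suffix of length k
                best = k
        table.append(best)
    return table
```

For the pattern "abab", the prefix "abab" has "ab" as both a proper prefix and a suffix, so Next_p(4) = 2.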

Lemma 2.4 ([28]). Let p be a pattern and x be a text over the alphabet Σ, and suppose that the degree of appearance of p in x at position i is equal to l, 0 ≤ l ≤ |p|. Then the degree of appearance l’ at position i+1 in x is given by l’ = Appearance_p(l, a), where a = x[i+1], and the function Appearance_p corresponding to p is determined by

\(\text{Appearance}_{p}(l, a)=\left\{\begin{array}{ll} 0 & l=0,\ a \neq p[1]; \text{ or } a \notin p, \\ 1 & l=0,\ a=p[1], \\ l+1 & 0<l<|p|,\ a=p[l+1], \\ \text{Appearance}_{p}\left(\operatorname{Next}_{p}(l), a\right) & 0<l<|p|,\ a \neq p[l+1]; \text{ or } l=|p|. \end{array}\right.\)

Theorem 2.5 ([28]). Let p be a pattern of length m and \(A_{p}=\left(\Sigma, Q_{p}, q_{0}, \delta_{p}, F_{p}\right)\) corresponding to p be an automaton over the same Σ, where

• Q_p = {0,1,...,m} is a set of states,

• q₀ = 0 is the initial state,

• F_p = {m} is a set of final states,

• δ_p is the transition function satisfying δ_p : Q_p × Σ → Q_p and δ_p(q,a) = Appearance_p(q,a), where the function Appearance_p corresponding to p is as given in Lemma 2.4,

• To accept an input string, we can extend the transition function to \(\delta_{p}: Q_{p} \times \Sigma^{*} \rightarrow Q_{p}\) such that \(\forall q \in Q_{p}, \delta_{p}(q, \varepsilon)=q, \forall s \in \Sigma^{*}, \forall a \in \Sigma, \delta_{p}(q, a s)=\delta_{p}\left(\delta_{p}(q, a), s\right)\).

Then the pattern p is accepted by the automaton A_p.
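The automaton of Theorem 2.5 can be sketched on plaintext as follows. This is only an illustration under stated assumptions: the helper names `next_table`, `appearance` and `occurrences` are hypothetical, the recursion of Lemma 2.4 is unrolled into a loop, the pattern is assumed to be non-empty, and reaching the final state m after reading x[i] is interpreted as an occurrence of p starting at position i − m + 1.

```python
def next_table(p: str) -> list[int]:
    """nxt[l] = Next_p(l) for l = 1..m (index 0 is unused), as in Definition 2.3."""
    m = len(p)
    nxt = [0] * (m + 1)
    for l in range(1, m + 1):
        for k in range(l - 1, 0, -1):       # longest candidate first
            if p[:k] == p[l - k:l]:
                nxt[l] = k
                break
    return nxt


def appearance(p: str, nxt: list[int], l: int, a: str) -> int:
    """Appearance_p(l, a) from Lemma 2.4, with the recursion written as a loop."""
    if a not in p:
        return 0
    m = len(p)
    while True:
        if l == 0:
            return 1 if a == p[0] else 0
        if l < m and a == p[l]:             # p[l+1] in 1-based notation
            return l + 1
        l = nxt[l]                          # fall back to Next_p(l), which is < l


def occurrences(p: str, x: str) -> list[int]:
    """1-based start positions where the automaton A_p reaches its final state m."""
    nxt = next_table(p)
    state, hits = 0, []
    for i, a in enumerate(x, start=1):
        state = appearance(p, nxt, state, a)
        if state == len(p):
            hits.append(i - len(p) + 1)
    return hits
```

For example, `occurrences("aba", "ababa")` reports the overlapping occurrences at positions 1 and 3, because after a full match the state falls back to Next_p(m) rather than to 0.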

Finally, we recall important definitions in [29] to construct the approximate pattern matching on encrypted data.

Let LCS(p, x) be a longest common subsequence of p and x. Denote |LCS(p, x)| by lcs(p, x). We let the lcs(p, x) equal 0, if there does not exist any longest common subsequences of strings p and x (for more details about the concept “longest common subsequence of two strings,” see [29]).

We see that a subsequence u has at least a location in p. Note that u = p[j₁]p[j₂]..p[j_l] is a subsequence of p, then vector (j₁,j₂,..., j_l) is a location of u in p. We sort all the different locations of u into the dictionary order, then call the leftmost location of u the least element, denoted by LeftID(u). The last component in LeftID(u) is denoted by Rm_p(u) [29].

The symbol Config(p) is the set of all configurations of p. If C ∈ Config(p), then C can be the empty set, denoted by C₀, or C can be an ordered set {u₁,u₂,...,u_l} with 1 ≤ l ≤ |p|, where u_i is a subsequence of p, 1 ≤ i ≤ l (see more detail in [29]).

Definition 2.6 ([29]). Let p be a string of length m and C ∈ Config(p). Then the weight of C is an ordered set, denoted by W_p(C), given by

(a) W_p(C), denoted by W₀, is the empty set if C = C₀.

(b) W_p(C) = {W_p(u₁),W_p(u₂),...,W_p(u_l)} if C = {u₁,u₂,...,u_l} for 1 ≤ l ≤ m, where the weight of u_i in p is denoted by W_p(u_i) and W_p(u_i) = |p| + 1 − Rm_p(u_i) for 1 ≤ i ≤ l.

Denote the set of all the weights of all the configurations of p by WConfig(p).

Definition 2.7 ([29]). Let p be a string of length m on the alphabet Σ and Σ_p be the set of all the letters of p. Then Ref_p of p is a function such that Ref_p : {1,...,m} × Σ_p → {1,...,m − 1} determined by

\(\operatorname{Ref}_{p}(i, a)=\left\{\begin{array}{ll} 0 & i=1, \\ \max \left\{W_{p}^{j}(a) \mid W_{p}^{j}(a)<i \text{ for } m+1-i<j \leq m\right\} & 2 \leq i \leq m, \end{array}\right.\)

where a ∈ Σ_p and the weight of the letter a at the location j in p is \(W_{p}^{j}(a)=m+1-j \).

Note that p is assumed to contain a. For 1 ≤ i ≤ |p|, if a ≠ p[i], we let \(W_{p}^{i}(a)=0\). At any location, the letter a has a weight. Denote the heaviest weight of a in p by Wm_p(a) [29].

Definition 2.8 ([29]). Let p be a string of length m over the alphabet Σ, W be a weight of a configuration of p and a ∈ Σ. Then a function δ_p is given by δ_p : WConfig(p) × Σ → WConfig(p) and

1. If a ∉ p, then δ_p(W,a) = W.

2. If a ∈ p, then δ_p(W₀,a) = {Wm_p(a)}.

3. Assume that a ∈ p and W = {w₁,w₂,...,w_l} for 1 ≤ l ≤ m. Put W’ = δ_p(W,a). Then W’ is computed by the following parallel algorithm:

(i) Put W’ = W;

Perform the block of the following commands in parallel way:

(ii) w’_l+1 = Ref_p(w_l,a) if Ref_p(w_l,a) ≠ 0;

(iii) The following commands are executed in parallel: for ∀i ∈ {1, 2,..., l - 1}, w’_i+1 = Ref_p(w_i,a) if Ref_p(w_i,a) > w_i+1;

(iv) w’₁ = Wm_p(a) if Wm_p(a) > w₁;

4. To accept an input string, we extend the function δ_p: δ_p : WConfig(p) × Σ∗ → WConfig(p) such that ∀W ∈ WConfig(p), δ_p(W,ε) = W, ∀u ∈ Σ∗,∀a ∈ Σ, δ_p(W,au) = δ_p(δ_p(W,a),u).
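Definitions 2.7 and 2.8 can be traced with the following sequential sketch. It is an illustration only: the names `heaviest_weights`, `ref`, `delta` and `lcs_length` are hypothetical, the parallel steps (ii) to (iv) are simulated one after another while always reading the old W, and taking the number of weights in the final configuration as lcs(p, x) is an assumption about how [29] uses δ_p.

```python
def heaviest_weights(p: str) -> dict[str, int]:
    """Wm_p(a): the weight m + 1 - j of the leftmost occurrence j of each letter a."""
    m = len(p)
    wm = {}
    for j, a in enumerate(p, start=1):
        wm.setdefault(a, m + 1 - j)
    return wm


def ref(p: str, i: int, a: str) -> int:
    """Ref_p(i, a): heaviest weight of a that is smaller than i, or 0 if none."""
    m = len(p)
    if i == 1:
        return 0
    best = 0
    for j in range(m + 2 - i, m + 1):       # m + 1 - i < j <= m
        if p[j - 1] == a:
            best = max(best, m + 1 - j)
    return best


def delta(p: str, wm: dict[str, int], W: list[int], a: str) -> list[int]:
    """One transition W' = delta_p(W, a) following Definition 2.8."""
    if a not in wm:                          # case 1: a does not occur in p
        return W
    if not W:                                # case 2: W is the empty weight W0
        return [wm[a]]
    new = list(W)                            # (i) W' = W; reads below use the old W
    l = len(W)
    r = ref(p, W[l - 1], a)
    if r != 0:
        new.append(r)                        # (ii) w'_{l+1}
    for i in range(1, l):                    # (iii) i = 1..l-1
        r = ref(p, W[i - 1], a)
        if r > W[i]:
            new[i] = r                       # w'_{i+1}
    if wm[a] > W[0]:
        new[0] = wm[a]                       # (iv) w'_1
    return new


def lcs_length(p: str, x: str) -> int:
    """Assumed reading of [29]: lcs(p, x) is the size of the final weight set."""
    wm = heaviest_weights(p)
    W: list[int] = []
    for a in x:
        W = delta(p, wm, W, a)
    return len(W)
```

For p = "abcb" and x = "bcab", the weight sets evolve as {3}, {3,2}, {4,2}, {4,3,1}, giving length 3, which matches the common subsequence "bcb".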

3. Main Results

In Subsection 3.1, we propose a novel cryptosystem based on our data hiding scheme (2,9,8) re-presented in Section 2 (Theorem 3.2 and Security analyses (3.3), (3.4)) and apply this cryptosystem to the process of encrypting and decrypting a secret data sequence (Proposition 3.4 and Security analyses (3.9), (3.10)). In Subsection 3.2, we use our automata approach recalled in Section 2 to design two algorithms for exact and approximate pattern matching on secret data encrypted by our cryptosystem proposed in Subsection 3.1 (Theorems 3.12 and 3.17).

3.1 A Novel Cryptosystem

Let Em’ be the function derived from the function Em by removing the two Statements (2.4) and (2.5). As in [27], the state q in Statement (2.3) is computed by q = δ(q, M) = δ₂(q, M), where q, M ∈ GF⁴(2²) and

\(\delta_{2}(q, M)=\left\{\begin{array}{ll} \varnothing & \text{if } M=q, \\ \left\{\left(i_{t}, a_{t}\right) \mid 1 \leq i_{t} \leq 9,\ t=\overline{1, k^{\prime}},\ k^{\prime} \leq 2,\ v_{i_{t}} \in S,\ a_{t} \in G F\left(2^{2}\right) \backslash\{0\},\ M+(-q)=\sum_{t=1}^{k^{\prime}} a_{t} v_{i_{t}}\ \left(\text{on } G F^{4}\left(2^{2}\right)\right)\right\} & \text{otherwise,} \end{array}\right.\)

where \(S=\left\{v_{1}, v_{2}, \ldots, v_{9}\right\}\) is a 2-Generators S for GF⁴(2²). Note that the number of S is given by [27]

\(c 3^{9} 9 ! \text { , where } c \approx 2^{20}\) (3.1)

Then it is easy to check that the function Em’ satisfies \(\mathrm{Em}^{\prime}: \mathrm{I} \times \mathrm{M} \times \mathrm{K} \rightarrow 2^{\{1,2, \ldots, 9\} \times\left(G F\left(2^{2}\right) \backslash\{0\}\right)}\). Ex’ is a function obtained from Ex by replacing image blocks \(I_{i_{t}}\) with image blocks \(I_{i_{t}}^{\prime}\) in Statement (2.4) and then inserting the two Statements (2.5) and (2.4) before Statement (2.6) in Ex, so that the function Ex’ satisfies \(E x^{\prime}: 2^{\{1,2, \ldots, 9\} \times\left(G F\left(2^{2}\right) \backslash\{0\}\right)} \times \mathrm{I} \times \mathrm{K} \rightarrow \mathrm{M}\). Since we have [27]

\(\forall(I, M, K) \in \mathrm{I} \times \mathrm{M} \times \mathrm{K}, E x(\operatorname{Em}(I, M, K), K)=M\)

and by our construction of the two functions Em’ and Ex’, we similarly obtain

\(\forall(I, M, K) \in \mathrm{I} \times \mathrm{M} \times \mathrm{K}, E x^{\prime}\left(E m^{\prime}(I, M, K), I, K\right)=M\). (3.2)

Remark 3.1. From defining two functions Em’ and Ex’ as above, all image blocks I used are not changed.

Consider Σ to be an alphabet of size 256. Set \(\mathcal{P}=\Sigma\).

In [27], \(\left(G F^{4}\left(2^{2}\right),+, \cdot\right)\) is considered a vector space over the field \(G F\left(2^{2}\right)\), where \(G F^{4}\left(2^{2}\right)=\left\{\left(x_{1}, x_{2}, x_{3}, x_{4}\right) \mid x_{i} \in G F\left(2^{2}\right), \forall i=\overline{1,4}\right\}\) with the vector addition and scalar multiplication given as follows.

\(x+y=\left(x_{1}+y_{1}, x_{2}+y_{2}, x_{3}+y_{3}, x_{4}+y_{4}\right)\),

\(a x=\left(a x_{1}, a x_{2}, a x_{3}, a x_{4}\right), a \in G F\left(2^{2}\right)\),

where \(x, y \in G F^{4}\left(2^{2}\right)\) and \(x=\left(x_{1}, x_{2}, x_{3}, x_{4}\right)\), \(y=\left(y_{1}, y_{2}, y_{3}, y_{4}\right)\). In addition, by the decimal representation of the vector space \(G F^{4}\left(2^{2}\right)\) over the field GF(2²), then \(|\mathcal{P}|=\left|G F^{4}\left(2^{2}\right)\right|=256\), hence there exists a bijective function f from \(\mathcal{P}\) to GF⁴(2²), denote the inverse function of f by f^-1. Put \(\mathcal{F}\) to be a set of all f.
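The vector space operations and one possible choice of the bijection f can be sketched as follows. Several details are assumptions made here for illustration: GF(2²) is realized as Z₂[x]/(x² + x + 1) with the elements coded as 0, 1, 2 (for x) and 3 (for x + 1), and `f` simply splits a byte into four 2-bit digits. Any of the 2⁸! bijections would serve equally well as f, and in the cryptosystem f is part of the secret key.

```python
# Multiplication table of GF(2^2) = Z_2[x]/(x^2 + x + 1), with 2 = x and 3 = x + 1.
GF4_MUL = [
    [0, 0, 0, 0],
    [0, 1, 2, 3],
    [0, 2, 3, 1],
    [0, 3, 1, 2],
]


def gf_add(a: int, b: int) -> int:
    """Addition in GF(2^2): coefficient-wise addition mod 2, i.e. XOR of the codes."""
    return a ^ b


def gf_mul(a: int, b: int) -> int:
    return GF4_MUL[a][b]


def vec_add(x: tuple, y: tuple) -> tuple:
    """x + y in GF^4(2^2), component by component."""
    return tuple(gf_add(xi, yi) for xi, yi in zip(x, y))


def scal_mul(a: int, x: tuple) -> tuple:
    """a x in GF^4(2^2) for a scalar a in GF(2^2)."""
    return tuple(gf_mul(a, xi) for xi in x)


def f(letter: int) -> tuple:
    """Illustrative bijection from {0..255} to GF^4(2^2): the four base-4 digits."""
    return tuple((letter >> shift) & 3 for shift in (6, 4, 2, 0))


def f_inv(v: tuple) -> int:
    return (v[0] << 6) | (v[1] << 4) | (v[2] << 2) | v[3]
```

Since the field has characteristic 2, every vector is its own additive inverse, so the term M + (−q) used by the embedding function coincides with M + q.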

From the function δ, the state q of the automaton A(I,M,K) computed by Statement (2.3) is a set. The state q may be one of the following sets: ∅, {(i, a)} for i ∈ {1,2,...,9}, a ∈ GF(2²)\{0} or {(i, a),(j, b)} for i, j ∈ {1,2,...,9}, a, b ∈ GF(2²)\{0}.

The index i ∈ {1,2,...,9} and the coefficient a ∈ GF(2²)\{0} = {1,2,3} can be represented by binary strings of lengths 4 and 2, respectively. Hence, we can use 12 binary bits to represent a state q. Suppose B is a binary string of length 12 representing an arbitrary state q, B = B₁₂ ...B₂B₁; then the storage structure of q in B is given as follows.

1. If q = ∅, then the value of any bit in B equals 0.

2. If q = {(i, a)}, then the values of 6 bits B₇,B₈,...,B₁₂ are 0; 6 remaining bits present (i, a), where 2-bit string B₂B₁ presents a, 4-bit string B₆B₅B₄B₃ presents i.

3. If q = {(i, a),(j, b)}, then the 6-bit string B₁₂B₁₁..B₇ presents (i, a) and the remaining 6-bit string B₆B₅..B₁ presents (j, b) in the above mentioned way.

Put Q to be a set of all possible states q, \(\mathcal{C}\) is a set of all 12-bit strings B presenting q, q ∈ Q. Consider a function h, \(h: Q \rightarrow \mathcal{C}, h(q)=B\), where q is presented by B. Obviously, h is a bijective function. Denote the inverse function of h by h⁻¹.
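The 12-bit storage structure and the function h can be sketched as below. The names `pack`, `unpack` and `all_states` are hypothetical, and two points are assumptions not fixed by the text above: a two-element state is stored in sorted order so that each state has exactly one code word, and the two pairs of a state are taken to have distinct indices i ≠ j.

```python
def pack(q: list) -> int:
    """h(q): encode a state of 0, 1 or 2 pairs (i, a) as a 12-bit integer B."""
    fields = [(i << 2) | a for i, a in sorted(q)]   # 4 bits for i, 2 bits for a
    if len(fields) == 0:
        return 0                                    # all twelve bits are 0
    if len(fields) == 1:
        return fields[0]                            # B12..B7 = 0, B6..B1 = (i, a)
    return (fields[0] << 6) | fields[1]             # B12..B7 = (i, a), B6..B1 = (j, b)


def unpack(B: int) -> list:
    """h^{-1}(B): a 6-bit field equal to 0 means 'no pair', since i >= 1."""
    q = []
    for field in (B >> 6, B & 0x3F):
        if field:
            q.append((field >> 2, field & 3))
    return q


def all_states() -> list:
    """Enumerate every state: empty, one pair, or two pairs with i < j."""
    pairs = [(i, a) for i in range(1, 10) for a in (1, 2, 3)]
    states = [[]] + [[pa] for pa in pairs]
    states += [[pa, pb] for pa in pairs for pb in pairs if pa[0] < pb[0]]
    return states
```

Under these assumptions there are 1 + 27 + 324 = 352 possible states, which is at least the 256 values needed to encode every letter, and `format(pack(q), "012b")` gives the code word as a binary string.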

Let \(\mathcal{K}=\{(f, K, I) \mid f \in \mathcal{F}, K \in \mathrm{K}, I \in \mathrm{I}\}\) be a finite set of secret keys. For \(k \in \mathcal{K}\), k = (f, K, I), we define e_k and d_k as follows.

1. \(e_{k}: \mathcal{P} \rightarrow \mathcal{C}, e_{k}(x)=h\left(\operatorname{Em}^{\prime}(I, f(x), K)\right) \text { for } x \in \mathcal{P}\).

2. \(d_{k}: \mathcal{C} \rightarrow \mathcal{P}, d_{k}(y)=f^{-1}\left(E x^{\prime}\left(h^{-1}(y), I, K\right)\right) \text { for } y \in \mathcal{C}\).

Set \(\mathcal{E}=\left\{e_{k} \mid k \in \mathcal{K}\right\}\), \(\mathcal{D}=\left\{d_{k} \mid k \in \mathcal{K}\right\}\). By Definition 2.1, the correctness of the cryptosystem \((\mathcal{P}, \mathcal{C}, \mathcal{K}, \mathcal{E}, \mathcal{D})\) is guaranteed by the following theorem.

Theorem 3.2. For all \(x \in \mathcal{P}\) and \(k \in \mathcal{K}\), with \(e_{k} \in \mathcal{E}\) and \(d_{k} \in \mathcal{D}\), we have d_k(e_k(x)) = x.

Proof. Set M = f(x),q = Em’ (I,M,K),B = h(q), then e_k(x) = B = y. We have h⁻¹(y) = h⁻¹(B) = q, Ex’ (q,I,K) = M by Formula (3.2), f⁻¹(M) = x, then d_k(y) = x. \(\square\)

Security analysis of the cryptosystem \((\mathcal{P}, \mathcal{C}, \mathcal{K}, \mathcal{E}, \mathcal{D})\): Assume that we publish parameters the flip graph G, Em’, Ex’, GF⁴(2²) and h in the cryptosystem \((\mathcal{P}, \mathcal{C}, \mathcal{K}, \mathcal{E}, \mathcal{D})\). The plaintext x is obtained from y by the Formula

\(x=d_{k}(y)=f^{-1}\left(E x^{\prime}\left(h^{-1}(y), I, K\right)\right)\).

So, to obtain x exactly, we need to know S and k = (f, K, I). The number of choices for the image block I is 256⁹ for gray images and \(2^{9t}\) for palette images, where t is the number of bits representing color indexes. Furthermore, by Formula (2.10), the security of the cryptosystem \((\mathcal{P}, \mathcal{C}, \mathcal{K}, \mathcal{E}, \mathcal{D})\) is given by the following formula

\(c 3^{9} 9 ! 2^{18} 2^{8} ! 256^{9}=c 3^{9} 9 ! 2^{90} 2^{8} ! \) for gray images, (3.3)

\(c 3^{9} 9 ! 2^{18} 2^{9 t} 2^{8} !=c 3^{9} 9 ! 2^{18+9 t} 2^{8} !\) for palette images. (3.4)
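To get a feeling for the magnitude of Formulas (3.3) and (3.4), the following sketch converts them into a number of bits. It is only a back-of-the-envelope aid: the function name `keyspace_bits` is hypothetical, c is taken to be exactly 2²⁰ although the text only states c ≈ 2²⁰, and a gray image is treated as the case of 8 bits per pixel.

```python
import math


def log2_factorial(n: int) -> float:
    """log2(n!) computed through the log-gamma function to avoid huge integers."""
    return math.lgamma(n + 1) / math.log(2)


def keyspace_bits(bits_per_pixel: int) -> float:
    """log2 of c * 3^9 * 9! * 2^18 * 2^(9 * bits_per_pixel) * (2^8)!, with c = 2^20.

    bits_per_pixel = 8 reproduces Formula (3.3), since 2^18 * 256^9 = 2^90;
    bits_per_pixel = t reproduces Formula (3.4) for palette images.
    """
    return (
        20                        # c, assumed to be exactly 2^20
        + 9 * math.log2(3)        # 3^9
        + log2_factorial(9)       # 9!
        + 18                      # 2^18 choices for the key K
        + 9 * bits_per_pixel      # choices for the image block I
        + log2_factorial(256)     # (2^8)! choices for the bijection f
    )
```

Most of the total comes from the (2⁸)! choices for f, so moving from gray images to a palette with t = 4 bits per index lowers the result by only 9 · 4 = 36 bits.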

Remark 3.3. By Remark 3.1, all pairs of functions (e_k, d_k) in the cryptosystem \((\mathcal{P}, \mathcal{C}, \mathcal{K}, \mathcal{E}, \mathcal{D})\) do not make the image blocks I change for \(\forall k \in \mathcal{K}\), \(k=(f, K, I)\). In addition, we can see that encrypting and hiding are done at the same time.

Consider an arbitrary subset of image blocks F as an input image, \(F \subset \mathrm{I}\), F = {F₁,F₂,...,F_t2}, where t₂ is the number of image blocks. Next, we show how to apply the cryptosystem \((\mathcal{P}, \mathcal{C}, \mathcal{K}, \mathcal{E}, \mathcal{D})\) to the process of encrypting and decrypting secret data over an insecure channel. By Remark 3.3, we can use a subset H of secret keys instead of a single secret key k,

\(H=\{(f, K, I) \mid K \in \mathrm{K}, I \in F\} \subset \mathcal{K}\)

for \(f \in \mathcal{F}\) and \(\mathrm{K}=\left\{K^{1}, K^{2}, \ldots, K^{t_{1}}\right\}\), where t₁ is the number of keys in \(\mathrm{K}\).

Suppose that secret data is a string x = x₁x₂..x_t3 for \(x_{i} \in \mathcal{P}\), \(\forall i=\overline{1, t_{3}}\), \(t_{3} \geq 1\). The encrypting algorithm e_H used to encrypt x is given as follows.

i_K = 1; i_F = 1;

For i = 1 to t₃ Do

{

\(\qquad k_{i}=\left(f, K^{i_{K}}, F_{i_{F}}\right);\) (3.5)

\(\qquad y_{i}=e_{k_{i}}\left(x_{i}\right);\) (3.6)

\(\qquad \begin{array}{l} i_{K}=i_{K} \bmod t_{1}+1 ; \\ i_{F}=i_{F} \bmod t_{2}+1 ; \end{array}\)

}

y = y₁y₂ ..y_t3;

The decrypting algorithm d_H used to decrypt y is given as follows.

i_K = 1; i_F = 1;

For i = 1 to t₃ Do

{

\(\qquad k_{i}=\left(f, K^{i_{K}}, F_{i_{F}}\right);\) (3.7)

\(\qquad x_{i}^{\prime}=d_{k_{i}}\left(y_{i}\right) ;\) (3.8)

\(\qquad \begin{array}{l} i_{K}=i_{K} \bmod t_{1}+1 ; \\ i_{F}=i_{F} \bmod t_{2}+1 ; \end{array}\)

}

x’ = x’₁x’₂ ..x’_t3;

Proposition 3.4. Let F, x, H and the cryptosystem \((\mathcal{P}, \mathcal{C}, \mathcal{K}, \mathcal{E}, \mathcal{D})\) based on the data hiding scheme (2,9,8) be as above. Then d_H(e_H(x)) = x.

Proof. Clearly, \(\forall i=\overline{1, t_{3}}\), k_i determined in Statement (3.5) is the same as in Statement (3.7). In addition, by Theorem 3.2, x_i is encrypted by Statement (3.6) and obtained by Statement (3.8) such that x’_i = x_i. Then x = x’, hence d_H(e_H(x)) = x. \(\square\)
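The key scheduling in e_H and d_H can be sketched as follows, assuming that the indices i_K and i_F advance cyclically through 1..t₁ and 1..t₂. The per-letter functions `toy_e` and `toy_d` are placeholders invented for this illustration (a simple modular shift) and stand in for e_k and d_k; they are not the steganography-based cryptosystem, and the component f of the key is folded into them.

```python
def apply_sequence(data, keys, blocks, fn):
    """Run the loop shared by e_H and d_H: pick k_i, apply fn, advance both indices."""
    t1, t2 = len(keys), len(blocks)
    i_K, i_F = 1, 1                                   # 1-based, as in the text
    out = []
    for item in data:
        k_i = (keys[i_K - 1], blocks[i_F - 1])        # k_i = (K^{i_K}, F_{i_F})
        out.append(fn(k_i, item))
        i_K = i_K % t1 + 1                            # cycle 1, 2, ..., t1, 1, ...
        i_F = i_F % t2 + 1                            # cycle 1, 2, ..., t2, 1, ...
    return out


def toy_e(k, x):
    """Placeholder for e_k: shift the letter by an amount derived from (K, I)."""
    K, I = k
    return (x + K + I) % 256


def toy_d(k, y):
    """Placeholder for d_k: undo the shift applied by toy_e."""
    K, I = k
    return (y - K - I) % 256


def encrypt_sequence(x, keys, blocks):
    return apply_sequence(x, keys, blocks, toy_e)


def decrypt_sequence(y, keys, blocks):
    return apply_sequence(y, keys, blocks, toy_d)
```

Because both loops visit the same sequence of keys k_i, decryption inverts encryption letter by letter, which is the argument of Proposition 3.4. The sketch also shows that the input may be longer than both the key list and the block list, in line with the fact that image blocks can be reused.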

Security analysis of the process of encrypting and decrypting the secret data x using the two algorithms e_H and d_H: Assume that we publish the same parameters as in the cryptosystem \((\mathcal{P}, \mathcal{C}, \mathcal{K}, \mathcal{E}, \mathcal{D})\). Hence, to restore x exactly, an attacker needs to know S and H. The number of choices for S is \(c\, 3^{9}\, 9!\) by Formula (3.1). The number of choices for H is \(2^{8}!\, 2^{18 t_{1}}\, 256^{9 t_{2}}\) for gray images and \(2^{8}!\, 2^{18 t_{1}}\, 2^{9 t t_{2}}\) for palette images, where t is the number of bits used to represent color indexes. Then, for a brute-force attack, the number of all possible combinations of S and H used in the two algorithms e_H and d_H is

\(c\, 3^{9}\, 9!\, 2^{8}!\, 2^{18 t_{1}}\, 256^{9 t_{2}}=c\, 3^{9}\, 9!\, 2^{8}!\, 2^{18 t_{1}+72 t_{2}}\) for gray images, (3.9)

\(c\, 3^{9}\, 9!\, 2^{8}!\, 2^{18 t_{1}}\, 2^{9 t t_{2}}=c\, 3^{9}\, 9!\, 2^{8}!\, 2^{18 t_{1}+9 t t_{2}}\) for palette images. (3.10)
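As a rough worked example, the contribution of H to Formulas (3.9) and (3.10) can be expressed in bits. The sketch below is only illustrative: it assumes that the factor written 2⁸! denotes (2⁸)!, it leaves out the count for S because that count depends on Formula (3.1), and the values chosen for t₁, t₂ and t are arbitrary.

```python
import math

def h_keyspace_bits_gray(t1: int, t2: int) -> float:
    """log2 of (2^8)! * 2^(18*t1) * 256^(9*t2), the H factor for gray images."""
    perm_bits = math.log2(math.factorial(2 ** 8))  # assumed reading of 2^8!
    return perm_bits + 18 * t1 + 72 * t2

def h_keyspace_bits_palette(t1: int, t2: int, t: int) -> float:
    """Same quantity for palette images with t-bit color indexes."""
    perm_bits = math.log2(math.factorial(2 ** 8))  # assumed reading of 2^8!
    return perm_bits + 18 * t1 + 9 * t * t2

# Each additional key adds 18 bits and each additional gray image block 72 bits.
print(round(h_keyspace_bits_gray(t1=4, t2=4)))
print(round(h_keyspace_bits_palette(t1=4, t2=4, t=4)))
```

Under these assumptions the exponent grows linearly in t₁ and t₂, so using more keys and more image blocks enlarges the search space of a brute-force attack without changing the per-symbol cost.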

Remark 3.5. For the two algorithms e_H and d_H given above, an arbitrary image block I in the input image F can be used many times in the process of encrypting and decrypting the secret data. So, for a given input image F, the amount of secret data that can be encrypted is not limited by the size of F.

3.2 Automata Technique for Pattern Matching on Encrypted Data

Suppose that Alice has secret data and prefers to outsource this data to a cloud provider Bob. As the server is semi-trusted, Alice needs to encrypt her plaintext and wishes to store only the ciphertext in the cloud. Assume that Alice uses the cryptosystem \((\mathcal{P}, \mathcal{C}, \mathcal{K}, \mathcal{E}, \mathcal{D})\) proposed in Subsection 3.1 to encrypt data with a pair of secret parameters (S, k) in the cryptosystem, where S is a 2-Generators for GFⁿ(p^m), |S| = 9 and \(k=(f, K, I) \in \mathcal{K}\).

Because of limited storage space and computing ability, instead of downloading ciphertext, decrypting it and searching locally, Alice may ask Bob to perform pattern matching tasks on the ciphertext directly with a trapdoor of the pattern received from her.

To support pattern matching on the server side without leaking information about the plaintext, below we construct pattern matching algorithms which can search for any pattern directly in the ciphertext.

Consider Σ to be an alphabet of size 256. Suppose that the secret data is a string over Σ

x = x₁x₂..x_t3

for \(x_{i} \in \mathcal{P}\), \(\forall i=\overline{1, t_{3}}, t_{3} \geq 1\) and t₃ is often a large natural number, where \(\mathcal{P}=\Sigma\).

Before uploading the secret data x to Bob, Alice uses the encrypting function \(e_{k} \in \mathcal{E}\) to encrypt each x_i. That is, Alice computes y_i = e_k(x_i), \(\forall i=\overline{1, t_{3}}\), and the encrypted secret data is a string over Σ’

y = y₁y₂..y_t3

which is sent to Bob, where Σ’ is an alphabet

Σ’ = {a’ | a’ = e_k(a),a ∈ Σ}.

In general, let x be any string over the alphabet Σ and let the string y be obtained from x in the above way. Then we write y = e_k(x) for short, and y is a string over the alphabet Σ’.

Remark 3.6. Since only one pair of secret parameters (S, k) is used, the security of the process of encrypting and decrypting the secret data x is similar to that given by Formula (3.3) (for gray images) or Formula (3.4) (for palette images).

Suppose that Bob needs to perform exact or approximate pattern matching tasks of an arbitrary pattern p on encrypted data y. Based on our previously introduced results in [28,29], we continue using automata technique to meet the requirements.

We first introduce some theoretical results that support exact pattern matching.

Proposition 3.7. Let p be a pattern over the alphabet Σ. Then Pos_p’(a’) = Pos_p(a) for ∀a’ ∈ Σ’, a = d_k(a’), where p’ = e_k(p).

Proof. Set i = Pos_p(a), then a = p_i, hence a’ = p’_i. Without loss of generality, suppose Pos_p’(a’) > i. Then ∃j > i, p’_j = a’ by Definition 2.2, so a = d_k(p’_j) = p_j. Then i < Pos_p(a), a contradiction. So, we complete the proof. \(\square\)

Proposition 3.8. Let a pattern p and a text x be two strings over the same alphabet Σ and the function Sign be given by

\(\forall a^{\prime} \in \Sigma^{\prime}, \operatorname{Sign}\left(a^{\prime}\right)=\left\{\begin{array}{ll} 1 & \text { If } a \in p, \\ 0 & \text { Otherwise,} \end{array}\right.\)

where a = d_k(a’). Then ∀a’ ∈ Σ’, a’ ∈ p’ if and only if Sign(a’) = 1, where p’ = e_k(p).

Proof. Let a’ ∈ Σ’ and a = d_k(a’). Then a’ ∈ p’ if and only if ∃i, 1 ≤ i ≤ |p’|, a’ = p’_i, if and only if a = p_i, if and only if Sign(a’) = 1. \(\square\)

Proposition 3.9. Let a pattern p and a text x be two strings over the same alphabet Σ. Then p occurs at a position i in x if and only if p’ occurs at the position i in y, where p’ = e_k(p) and y = e_k(x).

Proof. The pattern p occurs at a position i in x if and only if \(p=x_{i} x_{i+1} \ldots x_{i+|p|-1}\), if and only if \(y_{i} y_{i+1} \ldots y_{i+|p|-1}=p^{\prime}\), if and only if p’ occurs at the position i in y. \(\square\)
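Proposition 3.9 can be checked on a small example. The sketch below is illustrative only: the function e_k of the cryptosystem is replaced by a hypothetical stand-in, a fixed pseudo-random permutation of the 256 symbols, which shares the one property the proof uses (equal symbols have equal images and distinct symbols have distinct images).

```python
import random
from typing import List

def occurrences(pattern: bytes, text: bytes) -> List[int]:
    """Return all 1-based positions where pattern occurs in text."""
    m = len(pattern)
    return [i + 1 for i in range(len(text) - m + 1) if text[i:i + m] == pattern]

# Hypothetical stand-in for e_k: a fixed secret permutation of the 256 symbols.
rng = random.Random(2020)
table = list(range(256))
rng.shuffle(table)

def e_k(data: bytes) -> bytes:
    """Encrypt symbol by symbol with the stand-in permutation."""
    return bytes(table[b] for b in data)

x = b"abracadabra abracadabra"
p = b"abra"
y, p_enc = e_k(x), e_k(p)

# The occurrences of p in x and of p' in y are at the same positions.
assert occurrences(p, x) == occurrences(p_enc, y)
```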

Proposition 3.10. Let p be a pattern over the alphabet Σ. Then ∀l, 1 ≤ l ≤ |p|, Next_p’(l) = Next_p(l), where p’ = e_k(p).

Proof. Without loss of generality, suppose that l_m = Next_p(l) < Next_p’(l) for some l, 1 ≤ l ≤ |p|. Then \(p_{1}^{\prime} p_{2}^{\prime} \ldots p_{l_{m}+1}^{\prime}\) is both a proper suffix and a prefix of p’[1..l] by Definition 2.3. Since p’_i = e_k(p_i), ∀i = 1..|p|, \(p_{1} p_{2} \ldots p_{l_{m}+1}\) is also both a proper suffix and a prefix of p[1..l] by Definition 2.3. Then Next_p(l) > l_m. This is a contradiction to our supposition. So, the proof is complete. \(\square\)

Proposition 3.11. Let p be a pattern over the alphabet Σ. Then for ∀l, 0 ≤ l ≤ |p| and ∀a’ ∈ Σ’, a = d_k(a’), Appearance_p’(l,a’) = Appearance_p(l,a), where p’ = e_k(p).

Proof. Clearly, |p| = |p’| and for ∀i,1 ≤ i ≤ |p’|,∀a’, a’ ∈ Σ’, a’ = p’_i if and only if a = p_i. By Lemma 2.4 and Proposition 3.10, Appearance_p’(l,a’) = Appearance_p(l,a). \(\square\)

Theorem 3.12. Let p be a pattern over the alphabet Σ. Let two automata A_p = (Σ, Q_p, q₀, δ_p, F_p) and A_p’ = (Σ’, Q_p’, q₀, δ_p’, F_p’) be determined as in Theorem 2.5. Then Q_p’ = Q_p, F_p’ = F_p, ∀q ∈ Q_p’, ∀a’ ∈ Σ’, a = d_k(a’), δ_p’(q,a’) = δ_p(q,a), where p’ = e_k(p).

Proof. It is easy to verify that |p| = |p’|. In addition, by Theorem 2.5 and Proposition 3.11, then Q_p’ = Q_p, F_p’ = F_p and δ_p’(q,a’) = δ_p(q,a). \(\square\)

Remark 3.13. The meaning of Theorem 3.12 in practice is to compute δ_p’ from δ_p.
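The relabeling described in Remark 3.13 can be sketched as follows. Since Theorem 2.5 is not reproduced in this section, the sketch assumes that A_p behaves like the classical string-matching automaton with states 0..|p|; the permutation `e` is a hypothetical stand-in for e_k, and all function names are illustrative.

```python
import random
from typing import Dict, List, Sequence

def build_delta(p: bytes) -> List[Dict[int, int]]:
    """Classical string-matching automaton for p.
    delta[q][a] is the next state; symbols missing from delta[q] lead to 0."""
    m = len(p)
    delta: List[Dict[int, int]] = [dict() for _ in range(m + 1)]
    for q in range(m + 1):
        for a in set(p):
            # Longest prefix of p that is a suffix of p[:q] followed by a.
            s = p[:q] + bytes([a])
            k = min(m, q + 1)
            while k > 0 and s[len(s) - k:] != p[:k]:
                k -= 1
            if k:
                delta[q][a] = k
    return delta

def relabel(delta: List[Dict[int, int]], e: Sequence[int]) -> List[Dict[int, int]]:
    """delta_p'(q, e_k(a)) = delta_p(q, a): only the symbols are renamed."""
    return [{e[a]: nxt for a, nxt in row.items()} for row in delta]

def search(delta: List[Dict[int, int]], m: int, text: bytes) -> List[int]:
    """Run the automaton over text and return 1-based match positions."""
    q, hits = 0, []
    for i, sym in enumerate(text, start=1):
        q = delta[q].get(sym, 0)
        if q == m:
            hits.append(i - m + 1)
    return hits

rng = random.Random(7)
e = list(range(256))
rng.shuffle(e)  # hypothetical stand-in for e_k

p, x = b"aba", b"ababa cabab"
y = bytes(e[b] for b in x)
delta_p = build_delta(p)
delta_p_enc = relabel(delta_p, e)

# Searching y with the relabeled automaton finds the positions of p in x.
assert search(delta_p_enc, len(p), y) == search(delta_p, len(p), x)
```

In this sketch the states and final state are untouched by `relabel`, which mirrors Q_p’ = Q_p and F_p’ = F_p; only the transition labels change.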

Let a pattern p and a text (secret data) x be two strings over the same alphabet Σ and assume |p| ≤ |x|. Assume that we have only the encrypted secret data y, which is not decrypted to the secret data x. From Propositions 3.7, 3.8 and 3.9 and Theorem 3.12, based on the MR_c algorithm for c = 1, using the type a breaking point and the concept of Pos in [28] and the automaton A_p’ given as in Theorem 2.5, we immediately have an exact pattern matching algorithm that finds all occurrences of the pattern p in x as follows. Note that the trapdoor for the search pattern p is computed from p and consists of the length of p, the functions Sign and Pos_p’, and the automaton A_p’.

\(\text{jump}=|p|;\\ \text{While }(\text{jump} \leq|y|)\\ \{\\ \qquad \text{If }\left(\operatorname{Sign}\left(y_{\text{jump}}\right)==1\right)\\ \qquad \{\\ \qquad \qquad q=0;\\ \qquad \qquad i=\text{jump}-\operatorname{Pos}_{p'}\left(y_{\text{jump}}\right)+1;\\ \qquad \qquad \text{Do}\\ \qquad \qquad \{\\ \qquad \qquad \qquad q=\delta_{p'}\left(q, y_{i}\right);\\ \qquad \qquad \qquad \text{If }(q==|p|) \text{ Mark an occurrence of } p \text{ at } i-|p|+1 \text{ in } x;\\ \qquad \qquad \qquad i++;\\ \qquad \qquad \} \text{ While }(q \neq 0 \text{ and } i \leq|y|);\\ \qquad \qquad \text{jump}=i-1;\\ \qquad \}\\ \qquad \text{jump}=\text{jump}+|p|;\\ \}\)

Remark 3.14. The above algorithm has the same worst-case time complexity as our MR_1 algorithm in [28]. Hence, in the worst case, the time complexity of our new algorithm is O(n), where n is the length of the secret data.

Next, theoretical results for approximate pattern matching are shown as follows.

Proposition 3.15. Let p be a pattern over the alphabet Σ. Then WConfig(p’) = WConfig(p), where p’ = e_k(p).

Proof. Obviously, W₀ ∈ WConfig(p’) and W₀ ∈ WConfig(p). Consider any W’ ∈ WConfig(p’)\{W₀}; then we can set \(\mathrm{W}^{\prime}=\left\{\mathrm{w}_{1}^{\prime}, \mathrm{w}_{2}^{\prime}, \ldots, \mathrm{w}_{l}^{\prime}\right\}\) for 1 ≤ l ≤ |p’|. Then ∃C’ = {u’₁,u’₂,...,u’_l} ∈ Config(p’) by Definition 2.6, where W_p’(u’_i) = w’_i for 1 ≤ i ≤ l. Then ∃!C = {u₁,u₂,...,u_l} ∈ Config(p) with u_i = d_k(u’_i) for 1 ≤ i ≤ l. Set W = W_p(C), then W ∈ WConfig(p) by Definition 2.6. It is easy to verify that Rm_p’(u’_i) = Rm_p(u_i) for 1 ≤ i ≤ l, so W_p’(u’_i) = W_p(u_i) for 1 ≤ i ≤ l by Definition 2.6. Hence W’ = W, and therefore WConfig(p’) ⊂ WConfig(p). Similarly, we have WConfig(p) ⊂ WConfig(p’). So, the proof is complete. \(\square\)

Proposition 3.16. Let p be a pattern over the alphabet Σ. Then Ref_p’(i,a’) = Ref_p(i,a) for ∀i, 0 ≤ i ≤ |p’| and ∀a’ ∈ Σ’, a = d_k(a’), where p’ = e_k(p).

Proof. Clearly, \(W_{p'}^{i}\left(a^{\prime}\right)=W_{p}^{i}(a)\) by Definition 2.7. So, Ref_p’(i,a’) = Ref_p(i,a) by Definition 2.7. Hence, we complete the proof. \(\square\)

Theorem 3.17. Let p be a pattern over Σ and c be a positive integer constant with 1 ≤ c ≤ |p|. Let two automata \(A_{p}^{P_{c}}=\left(\Sigma, Q_{p}, q_{0}, \delta_{p}, F_{p}\right)\) and \(A_{p^{\prime}}^{P_{c}}=\left(\Sigma^{\prime}, Q_{p^{\prime}}, q_{0}, \delta_{p'}, F_{p'}\right)\) be determined as in Theorem 39 [29]. Then Q_p’ = Q_p, F_p’ = F_p, ∀q ∈ Q_p’, ∀a’ ∈ Σ’, a = d_k(a’), δ_p’(q,a’) = δ_p(q,a), where p’ = e_k(p).

Proof. By Proposition 3.15, Q_p’ = Q_p. Evidently, ∀a’ ∈ Σ’, a = d_k(a’), a’ ∈ p’ if and only if a ∈ p. Furthermore, by Definition 2.8 and Proposition 3.16, δ_p’(W,a’) = δ_p(W,a). Then F_p’ = F_p. So, the proof is complete. \(\square\)

Remark 3.18. The meaning of Theorem 3.17 in practice is to compute δ_p’ from δ_p.

Based on the approximate pattern matching problems considered in [17, 18, 20], we introduce a new concept of the appearance of the pattern p in x with a given error. This is a basis for giving requirements for the approximate pattern matching algorithm.

Definition 3.19. Let p and x be two strings over Σ, d be a string similarity measure and ε ∈ ℝ, ε > 0, be an error. Then p appears in x with the error ε if there exists a substring u of x such that d(p, u) ≤ ε.

To construct the approximate pattern matching algorithm, we need a function to measure string similarity. The most commonly used similarity measures are recalled in [19, 20, 21]. Bakkelund [1] proposed a well-known string similarity measure based on the longest common subsequence. Similarly, here we define a new measure of similarity between two strings

\(d(p, u)=1-\frac{\operatorname{lcs}(p, u)}{\min \{|p|,|u|\}}\), (3.11)

where p is a pattern, u is a non-empty substring of x and lcs(p, u) is the length of a longest common subsequence of p and u. Clearly, d given above is non-negative and symmetric, and d(p, p) = 0.

Proposition 3.20. Let p and x be two strings over Σ. Then for every substring u’ of y, d(p’, u’) = d(p, u), where p’ = e_k(p), y = e_k(x) and u = d_k(u’).

Proof. Clearly, |p’| = |p|, |u’| = |u| and lcs(p,u) = lcs(p’,u’). By Formula (3.11), d(p’ , u’) = d(p, u). So, we complete the proof. \(\square\)
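Formula (3.11) and Proposition 3.20 can be illustrated with a short sketch. It uses the classical dynamic-programming computation of lcs rather than the automata technique of [29], and a hypothetical permutation `e` as a stand-in for e_k; both are assumptions made only for this example.

```python
import random

def lcs(a: bytes, b: bytes) -> int:
    """Length of a longest common subsequence (classical O(|a||b|) DP)."""
    prev = [0] * (len(b) + 1)
    for ca in a:
        cur = [0]
        for j, cb in enumerate(b, start=1):
            cur.append(prev[j - 1] + 1 if ca == cb else max(prev[j], cur[j - 1]))
        prev = cur
    return prev[-1]

def d(p: bytes, u: bytes) -> float:
    """Similarity measure of Formula (3.11); both strings must be non-empty."""
    return 1 - lcs(p, u) / min(len(p), len(u))

rng = random.Random(11)
e = list(range(256))
rng.shuffle(e)  # hypothetical stand-in for e_k

def enc(s: bytes) -> bytes:
    """Encrypt symbol by symbol with the stand-in permutation."""
    return bytes(e[c] for c in s)

p, u = b"steganography", b"cryptography"
assert lcs(p, u) == lcs(enc(p), enc(u))
assert d(p, u) == d(enc(p), enc(u))
```

The measure depends only on which positions hold equal symbols, so any symbol-wise injective encryption leaves it unchanged.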

By using the string similarity measure given in Formula (3.11), the automata technique for computing lcs(p’, u’) [29] will make an approximate pattern matching algorithm fast, and especially efficient for one pattern and a set of a large number of encrypted texts.

Let p be a pattern and x be a text (secret data) over the same alphabet Σ, and let u be an arbitrary substring of x. Let ε, 0 < ε < 1, and let d(p, u) be given as in Formula (3.11) such that d(p, u) ≤ ε. Then by Proposition 3.20, d(p’, u’) ≤ ε, where p’ = e_k(p) and u’ = e_k(u). By Formula (3.11), we have

\(\operatorname{lcs}\left(p^{\prime}, u^{\prime}\right) \geq(1-\varepsilon) \min \left\{\left|p^{\prime}\right|,\left|u^{\prime}\right|\right\}\). (3.12)

Conversely, if there is a substring u’ of y such that lcs(p’, u’) ≥ (1-ε)|p|, then Formula (3.12) holds because min{|p’|, |u’|} ≤ |p’| = |p|, which means d(p’, u’) ≤ ε. Hence, by Proposition 3.20, there exists a substring u = d_k(u’) of x with d(p, u) ≤ ε. So, the constant c in Theorem 39 [29] is determined by \(c=\lceil(1-\varepsilon)|p|\rceil\).
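A small numeric sketch of the threshold c = ⌈(1-ε)|p|⌉ follows. The helper name `threshold` is illustrative, and ε is passed as an exact fraction to avoid floating-point rounding in the ceiling.

```python
import math
from fractions import Fraction

def threshold(pattern_length: int, eps: Fraction) -> int:
    """c = ceil((1 - eps) * |p|), the lcs length that has to be reached."""
    if not 0 < eps < 1:
        raise ValueError("eps must satisfy 0 < eps < 1")
    # Fraction keeps (1 - eps) * |p| exact, so the ceiling is not affected
    # by binary rounding of decimal values such as 0.3.
    return math.ceil((1 - eps) * pattern_length)

# For |p| = 13 and eps = 1/4, (1 - eps)|p| = 9.75, so c = 10.
assert threshold(13, Fraction(1, 4)) == 10
# For |p| = 10 and eps = 3/10, (1 - eps)|p| = 7 exactly, so c = 7.
assert threshold(10, Fraction(3, 10)) == 7
```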

Without decrypting y, based on Theorem 3.17, Definition 3.19 and Formula (3.11), and using the automaton \(A_{p^{\prime}}^{P_{c}}\) given as in Theorem 39 [29], we immediately have an approximate pattern matching algorithm which determines whether or not p appears in x with the error ε, as follows. Here, the trapdoor corresponding to the pattern p is determined from p and ε, and consists of the constant c and the automaton \(A_{p^{\prime}}^{P_{c}}\).

\(\text{app}=0;\\ q=W_{0}; \text{ // The automaton } A_{p^{\prime}}^{P_{c}} \text{ starts from the initial state } W_{0}\\ \text{For } i=1 \text{ to }|y| \text{ Do}\\ \{\\ \qquad q=\delta_{p'}\left(q, y_{i}\right);\\ \qquad \text{If }(|q|==c)\ \{\text{app}=1; \text{ Break;}\}\\ \}\\ \text{If }(\text{app}==1) \text{ Announce the appearance of the pattern } p \text{ in } x \text{ with the error } \varepsilon;\\ \text{Else Announce that } p \text{ does not appear in } x \text{ with the error } \varepsilon.\)

Remark 3.21. Since we can compute δ_p’ from δ_p, our proposed algorithm is similar to Algorithm 2 (the parallel algorithm) in [29]. In addition, according to Theorem 39 [29], δ_p is computed in parallel and Algorithm 2 has worst-case time complexity O(n) under the assumption that it uses k processors, where k is an upper estimate of lcs(p, x). As an immediate consequence, the above algorithm has O(n) time complexity in the worst case when it uses \(\lceil(1-\varepsilon)|p|\rceil\) processors.
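The decision made by the approximate algorithm can be imitated sequentially. The sketch below is not the parallel automaton of [29]: as a labeled substitute it keeps one row of the classical lcs dynamic-programming table, which costs O(|p|·|y|) time on a single processor, and it assumes that reaching |q| = c corresponds to the lcs of p’ and the scanned prefix of y reaching c. The function name is illustrative.

```python
from typing import Optional

def first_position_reaching(p_enc: bytes, y: bytes, c: int) -> Optional[int]:
    """Scan y left to right and return the first 1-based i such that
    lcs(p_enc, y[1..i]) >= c, or None if the threshold is never reached."""
    row = [0] * (len(p_enc) + 1)  # row[j] = lcs(p_enc[:j], scanned prefix of y)
    for i, sym in enumerate(y, start=1):
        diag = 0  # value of the previous row at column j - 1
        for j, pj in enumerate(p_enc, start=1):
            above = row[j]
            row[j] = diag + 1 if pj == sym else max(above, row[j - 1])
            diag = above
        if row[-1] >= c:
            return i  # corresponds to app = 1 and Break
    return None       # corresponds to app = 0

# The routine only compares symbols for equality, so it can be run on the
# ciphertext y and the encrypted pattern p' exactly as on plaintext.
assert first_position_reaching(b"abcde", b"xxaxbxcxx", 3) == 7
assert first_position_reaching(b"abcde", b"xxaxbxcxx", 4) is None
```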

4. Conclusions

Building on our results in the steganography and pattern matching areas and on the future work suggested in [27, 28, 29], this paper has completed some parts of that work. Based on the data hiding scheme (2, 9, 8) in [27], we construct a novel cryptosystem with high security. This method allows encrypting and hiding to be done at once, and the ciphertext does not depend on the input image size, unlike existing hybrid techniques of cryptography and steganography. Next, we use this cryptosystem to encrypt secret data on the user side. For the ciphertext, we design two pattern matching algorithms that search for any pattern in it directly on the cloud server side. The idea of the design is to apply our automata approach to the exact pattern matching and longest common subsequence problems in [28,29]. Under the assumption that the approximate algorithm uses \(\lceil(1-\varepsilon) m\rceil\) processors, the time complexities of these algorithms are both O(n) in the worst case, where ε, m and n are the error of our measure of similarity between two strings and the lengths of the pattern and secret data, respectively.

In our automata approach to pattern matching, the automata are constructed from the search patterns only. The algorithms are therefore particularly advantageous when one given pattern is searched for in a very large set of ciphertexts stored in the cloud. So, in the future, we will continue studying this technique and its application to searchable encryption.

Acknowledgements

The author is truly grateful to Phan Trung Huy, Phan Thi Ha Duong and Vu Thanh Nam for their valuable suggestions and help. This work was partially funded by the Vietnam National Foundation for Science and Technology Development (NAFOSTED) under the grant number 101.99-2016.16.

References

D. Bakkelund, "An LCS-based String Metric," University of Oslo (Norway), September 23, 2009.
P. Bharti, R. Soni, "A New Approach of Data Hiding in Images Using Cryptography and Steganography," International Journal of Computer Applications, 58(18), pp. 1-5, 2012. https://doi.org/10.5120/9379-3716
A. Blumer, J. Blumer, D. Haussler, A. Ehrenfeucht, M. T. Chen, J. Seiferas, "The Smallest Automaton Recognizing The Subwords of A Text," Theoretical Computer Science, Volume 40, pp. 31-55, 1985.
S. Chakraborty, S. K. Bandyopadhyay, "Steganography Method Based on Data Embedding by Sudoku Solution Matrix," International Journal of Engineering Science Invention, 2(7), pp. 36-42, 2013.
A. Chatterjee, A.K. Das, "Secret Communication Combining Cryptography and Steganography," Progress in Advanced Computing and Intelligent Engineering, Vol. 563, pp. 281-291, 2018. https://doi.org/10.1007/978-981-10-6872-0_26
G. Chugh, "Information Hiding - Steganography & Watermarking: A Comparative Study," International Journal of Advanced Research in Computer Science, 4(4), pp. 165-171, 2013.
N. Desmoulins, P. A. Fouque, C. Onete, O. Sanders, "Pattern Matching on Encrypted Streams," Advances in Cryptology - ASIACRYPT 2018, pp. 121-148, 2018.
Q. Dong, Z. Guan, L. Wu, Z. Chen, "Fuzzy Keyword Search over Encrypted Data in The Public Key Setting," Web-Age Information Management, pp. 729-740, 2013.
R. Dowsley, A. Michalas, M. Nagel, N. Paladi, "A Survey on Design and Implementation of Protected Searchable Data in The Cloud," Computer Science Review, Volume 26, pp. 17-30, 2017. https://doi.org/10.1016/j.cosrev.2017.08.001
A. Ehrenfeucht, R. M. McConnell, N. Osheim, S. W. Woo, "Position Heaps: A Simple and Dynamic Text Indexing Data Structure," Journal of Discrete Algorithms, Vol. 9, pp. 100-121, 2011. https://doi.org/10.1016/j.jda.2010.12.001
Y. K. Gedam, J.N. Varshapriya, "Fuzzy Keyword Search over Encrypted Data in Cloud Computing," Journal of Engineering Research and Applications, 4(7), pp. 197-202, 2014.
F. Han, J. Qin, J. Hu, "Secure Searches in The Cloud: A Survey," Future Generation Computer Systems, Vol. 62, pp. 66-75, 2016. https://doi.org/10.1016/j.future.2016.01.007
R. Haynberg, J. Rill, D. Achenbach, J. Muller-Quade, "Symmetric Searchable Encryption for Exact Pattern Matching Using Directed Acyclic Word Graphs," in Proc. of 2013 International Conference on Security and Cryptography (SECRYPT), pp. 403-410, 2013.
M. Jain, S. K. Lenka, "A Review of Digital Image Steganography Using LSB and LSB Array," International Journal of Applied Engineering Research, 11(3), pp. 1820-1824, 2016.
N. S. Jho, D. Hong, "Symmetric Searchable Encryption with Efficient Conjunctive Keyword Search," KSII Transactions on Internet and Information Systems, 7(5), pp. 1328-1342, 2013. https://doi.org/10.3837/tiis.2013.05.022
M. S. John, P. SumaLatha, M. Joshuva, "A Comparative Study of Index-Based Searchable Encryption Techniques," International Journal of Advanced Research in Computer Science, 6(3), pp. 13-15, 2015.
G. M. Landau, U. Vishkin, "Efficient String Matching with k Mismatches," Theoretical Computer Science, Vol. 43, pp. 239-249, 1986. https://doi.org/10.1016/0304-3975(86)90178-7
J. V. Leeuwen, "Handbook of Theoretical Computer Science," Elsevier MIT Press, Vol. A, pp. 290-300, 1990.
Z. Mei, B. Wu, S. Tian, Y. Ruan, Z. Cui, "Fuzzy Keyword Search Method over Ciphertexts Supporting Access Control," KSII Transactions on Internet and Information Systems, 11(11), pp. 5671-5693, 2017. https://doi.org/10.3837/tiis.2017.11.027
G. Navarro, "A Guided Tour to Approximate String Matching," ACM Computing Surveys, 33(1), pp. 31-88, 2001. https://doi.org/10.1145/375360.375365
P. H. Paris, N. Abadie, C. Brando, "Linking Spatial Named Entities to The Web of Data for Geographical Analysis of Historical Texts," Journal of Map & Geography Libraries, 13(1), pp. 82-110, 2017. https://doi.org/10.1080/15420353.2017.1307306
D. X. Song, D. Wagner, A. Perrig, "Practical Techniques for Searches on Encrypted Data," in Proc. of 2000 IEEE Symposium on Security and Privacy, pp. 44-55, 2000.
S. Song, J. Zhang, X. Liao, J. Du, Q. Wen, "A Novel Secure Communication Protocol Combining Steganography and Cryptography," Procedia Engineering, Vol. 15, pp. 2767-2772, 2011. https://doi.org/10.1016/j.proeng.2011.08.521
D. R. Stinson, "Cryptography: Theory and Practice (CRC Press Series on Discrete Mathematics and Its Application)," CRC Press, pp. 1-20, 180-184, 1995.
M. Strizhov, Z. Osman, I. Ray, "Substring Position Search over Encrypted Cloud Data Supporting Efficient Multi-User Setup," Future Internet, 8(3), 28, 2016. https://doi.org/10.3390/fi8030028
D. M. Sunday, "A Very Fast Substring Search Algorithm," Communications of The ACM, 33(8), pp. 132-142, 1990. https://doi.org/10.1145/79173.79184
N. H. Truong, "A New Digital Image Steganography Approach Based on The Galois Field GF(p^m) Using Graph and Automata," KSII Transactions on Internet and Information Systems, 13(9), pp. 4788-4813, 2019. https://doi.org/10.3837/tiis.2019.09.025
N. H. Truong, "A New Approach to Exact Pattern Matching," Journal of Computer Science and Cybernetics, 35(3), pp. 197-216, 2019. https://doi.org/10.15625/1813-9663/35/3/13620
N. H. Truong, "Automata Technique for The LCS Problem," Journal of Computer Science and Cybernetics, 35(1), pp. 21-37, 2019. https://doi.org/10.15625/1813-9663/35/1/13293
Varsha, R. S. Chhillar, "Data Hiding Using Steganography and Cryptography," International Journal of Computer Science and Mobile Computing, 4(4), pp. 802-805, 2015.
R.M. Yadav, D. S. Tomar, R. K. Baghel, "A Study on Image Steganography Approaches in Digital Images," Engineering Universe for Scientific Research and Management, 6(5), pp. 1-6, 2014.
W. Yunling, W. Jianfeng, C. Xiaofeng, "Secure Searchable Encryption: A Survey," Journal of Communications and Information Networks, 1(4), pp. 52-65, 2016. https://doi.org/10.1007/BF03391580
L. Wei, H. Zhu, Z. Cao, X. Dong, W. Jia, Y. Chen, A. Vasilakos, "Security and Privacy for Storage and Computation in Cloud Computing," Information Sciences, Vol. 258, pp. 371-386, 2014. https://doi.org/10.1016/j.ins.2013.04.028
B.B. Zaidan, A. A. Zaidan, A. K. Al-Frajat, H. A. Jalab, "On The Differences between Hiding Information and Cryptography Techniques: An Overview," Journal of Applied Science, 10(15), pp. 1650-1655, 2010. https://doi.org/10.3923/jas.2010.1650.1655
