The following sections explore the design and process principles which enable UI2CODE to generate these results.

Architectural Design and Process Breakdown

The following diagram offers an overview of UI2CODE’s architectural design.

With it, the system’s process can be briefly summarized in the steps shown below:

This process breaks down into four steps. First, GUI elements are extracted from the visual draft using deep learning. Next, deep learning techniques identify the type of each GUI element. Third, a DSL description is generated using a recursive neural network. Finally, the corresponding Flutter code is generated through syntax-tree template matching.
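As a rough illustration, these four steps can be sketched as a pipeline of stubs. Every function here is a stand-in for the real model, an assumption for illustration rather than UI2CODE's actual API:

```python
def extract_gui_elements(draft):
    # step 1: slicing via background/foreground analysis (stub)
    return ["img_crop", "text_crop"], [(0, 0), (0, 40)]

def classify_elements(elements):
    # step 2: component identification with a classification network (stub)
    return ["Image", "Text"]

def generate_dsl(types, coords):
    # step 3: a recursive neural network produces a DSL description (stub)
    return [{"type": t, "pos": p} for t, p in zip(types, coords)]

def generate_flutter_code(dsl):
    # step 4: syntax-tree template matching emits Flutter widgets (stub)
    return "\n".join(f"{node['type']}()" for node in dsl)

elements, coords = extract_gui_elements("draft.png")
types = classify_elements(elements)
code = generate_flutter_code(generate_dsl(types, coords))
print(code)  # → Image()
             #   Text()
```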

The following sections discuss key steps in the UI2CODE system in detail.

Background/Foreground Analysis

Background/foreground analysis in UI2CODE serves a single purpose: slicing. The quality of this slicing directly determines the accuracy of UI2CODE's output.

The following white-background UI offers a straightforward example:

After reading this UI into memory, it is converted to a grayscale matrix in preparation for binarization:

from PIL import Image
import numpy as np

def image_to_matrix(filename):
    # load the UI screenshot and convert it to an 8-bit grayscale matrix
    im = Image.open(filename)
    width, height = im.size
    im = im.convert("L")
    matrix = np.asarray(im)
    return matrix, width, height

The result is a two-dimensional matrix; after binarization, the white background of this UI maps to zero:
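For instance, a 0/1 mask with the white background mapped to zero can be derived from the grayscale matrix with a simple threshold. This is a sketch; the threshold value here is an assumption for illustration:

```python
import numpy as np

def binarize(matrix, white_threshold=250):
    # map near-white background pixels to 0 and all other (foreground)
    # pixels to 1; the 250 cutoff is an illustrative choice
    return np.where(matrix >= white_threshold, 0, 1)

mask = binarize(np.array([[255, 255, 40],
                          [255, 200, 30]]))
print(mask.tolist())  # → [[0, 0, 1], [0, 1, 1]]
```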

Only five cuts are needed to separate all of the GUI elements. There are various ways of making these cuts; the following cross-cut snippet is slightly simpler than the actual cutting logic, which is essentially recursive:

def cut_by_col(cut_num, _im_mask):
    zero_start = None
    zero_end = None
    end_range = len(_im_mask)
    for x in range(0, end_range):
        im = _im_mask[x]
        if len(np.where(im == 0)[0]) == len(im):
            # the whole row is background (all zeros)
            if zero_start is None:
                zero_start = x
        elif zero_start is not None and zero_end is None:
            # first foreground row after a background run
            zero_end = x
        if zero_start is not None and zero_end is not None:
            # a background run has just closed; record its start as a cut point
            start = zero_start
            if start > 0:
                cut_num.append(start)
            zero_start = None
            zero_end = None
        if x == end_range - 1 and zero_start is not None and zero_end is None and zero_start > 0:
            # a background run reaches the last row; close it out
            zero_end = x
            start = zero_start
            if start > 0:
                cut_num.append(start)
            zero_start = None
            zero_end = None
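For intuition, the same cross cut can be expressed more compactly with NumPy. This is a sketch of the idea on an invented toy mask, not the production logic:

```python
import numpy as np

def find_cut_rows(mask):
    # a row is background when every pixel in it is zero; the first row
    # of each background run (other than one starting at row 0) is a cut
    is_bg = (mask == 0).all(axis=1)
    cuts = []
    prev_bg = False
    for x, bg in enumerate(is_bg):
        if bg and not prev_bg and x > 0:
            cuts.append(x)
        prev_bg = bg
    return cuts

mask = np.array([
    [0, 0, 0, 0],
    [1, 1, 0, 0],
    [0, 0, 0, 0],
    [1, 0, 1, 0],
    [0, 0, 0, 0],
])
print(find_cut_rows(mask))  # → [2, 4]
```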

The client UI is essentially a vertical flow layout, so a cross cut can be made first, followed by a vertical cut:
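Combining the two directions, the cross cut followed by a vertical cut within each band can be sketched as follows. This is a simplified illustration of the recursive slicing, run on an invented toy mask:

```python
import numpy as np

def slice_mask(mask):
    # cross-cut the mask into horizontal bands of foreground rows, then
    # vertically cut each band into element bounding boxes
    def runs(is_fg):
        # contiguous runs of foreground indices as (start, end) pairs
        out, start = [], None
        for i, fg in enumerate(is_fg):
            if fg and start is None:
                start = i
            elif not fg and start is not None:
                out.append((start, i))
                start = None
        if start is not None:
            out.append((start, len(is_fg)))
        return out

    boxes = []
    for y0, y1 in runs((mask != 0).any(axis=1)):      # horizontal bands
        band = mask[y0:y1]
        for x0, x1 in runs((band != 0).any(axis=0)):  # vertical cuts per band
            boxes.append((x0, y0, x1, y1))
    return boxes

mask = np.array([
    [1, 1, 0, 0],
    [0, 0, 0, 0],
    [1, 0, 0, 1],
])
print(slice_mask(mask))
```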

At this point, the X and Y coordinates of each cut point are recorded; they form the core of each component's positional relationships. Slicing thus yields two sets of data: six GUI element images and their corresponding coordinate records. In subsequent steps, component identification is performed with a classification neural network.
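The two data sets might be kept together as records like the following; the field names here are illustrative, not UI2CODE's actual schema:

```python
from dataclasses import dataclass

@dataclass
class SlicedElement:
    # one GUI element produced by slicing: its cropped image plus the
    # cut-point coordinates that anchor it in the layout
    image_path: str
    x: int
    y: int
    width: int
    height: int

elements = [
    SlicedElement("element_0.png", 0, 0, 320, 48),
    SlicedElement("element_1.png", 0, 48, 320, 120),
]
print(elements[1].y)  # → 48
```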

In actual production processes, background/foreground analysis becomes more complicated, mainly in terms of dealing with complex backgrounds.

Component Identification

Prior to component identification, sample components must be collected for training. The CNN and SSD models provided in TensorFlow are then used for incremental training at this stage.

UI2CODE classifies GUI elements into a wide variety of types, including Image, Text, Shape/Button, Icon, Price, and others. These types are then grouped into UI components, CI components, and BI components: UI components mainly cover Flutter-native components; CI components mainly cover Xianyu's custom UIKIT; and BI components mainly cover feed-stream cards with specific business relevance.
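One way to picture this grouping is as a lookup from element type to component class. The assignment of types to groups below is an assumption for illustration, not the article's exact mapping:

```python
# Illustrative grouping of element types into the three component classes;
# the exact membership of each group is assumed, not taken from UI2CODE.
COMPONENT_GROUPS = {
    "UI": {"Image", "Text", "Shape/Button", "Icon"},  # Flutter-native components
    "CI": {"Price"},                                  # Xianyu custom UIKIT
    "BI": {"FeedCard"},                               # business-specific feed cards
}

def group_of(element_type):
    # return the component class a GUI element type belongs to, if any
    for group, members in COMPONENT_GROUPS.items():
        if element_type in members:
            return group
    return None

print(group_of("Icon"))  # → UI
```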

Component identification requires repeated correction through global feature feedback, usually implemented with a convolutional neural network. In the following screenshot, for example, the two crimson characters of text (translating to "Brand new") comprise the rich-text portion of the image, while the same shape style may also appear in buttons or icons.