I like mathematics, especially mathematical modeling, which transforms a real-world problem into an
abstract mathematical model, solves it mathematically, and then carries the solution back to real life to make a better world. My undergraduate
study in Information and Computing Science gave me a way to walk through the icy beauty of mathematics. Having taken part in both national and international mathematical
modeling contests, I learned how to apply theory to practical scenarios. During my graduate study in
Information Security, I encountered tremendous engineering problems and realized that I needed to explore more and go deeper in both academic research and engineering practice,
which led me to PhD study.
Research Interests: Formal Verification, Programming Languages, Systems and Security, and Artificial Intelligence, as well as the application of these theories and technologies in
On the personal side, I enjoy sports (working out, running, skiing, swimming),
dancing to music, reading, and traveling to interesting cities.
Wang Sai, Guo Yanhui, Wu Qiuxin, Liu Yuandong,
"A detection method of Android application malicious behaviors based on Xposed framework",
Sciencepaper Online12: 1264
Mobile Application Data Analysis
When you click the button on your app, what is it really doing?
The mobile data we study here consists mainly of mobile application online reviews and mobile app API data.
Our laboratory has years of experience in software security analysis, including malicious software detection,
code security audits and software security hardening. We have therefore collected a huge amount of data about
software, stored on our local server, and we find that this data is becoming larger and noisier while still holding
great value for information reuse. We therefore started this project to clean and reuse the data we have
and to analyze it more deeply from different angles, ushering in a new way to study mobile app security.
Users download apps from app stores and write reviews to share their experiences of app performance.
These reviews contain valuable information for app developers and also attract potential users. Our work on review analysis includes
feature extraction, text classification, and sentiment computation. For app API data, we parse the log files,
record the parameter values of API calls, and feed them into suitable mathematical models. In this way we propose a novel approach to detecting
malicious apps, distinct from our previous work, which focused on code analysis.
Because computing over such large data consumes great resources, I am also doing research on algorithm optimization
in machine learning, trying to improve the simultaneous matrix diagonalization in principal component analysis (PCA).
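As a point of reference for the diagonalization step mentioned above, here is a minimal sketch of plain two-dimensional PCA, in which a single Jacobi rotation diagonalizes the covariance matrix. It is not the optimized method under study; the function names and the sample points are illustrative assumptions.

```python
import math


def covariance_2d(points):
    """Return the sample covariance entries (cxx, cxy, cyy) of 2-D points."""
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    cxx = sum((x - mean_x) ** 2 for x, _ in points) / (n - 1)
    cyy = sum((y - mean_y) ** 2 for _, y in points) / (n - 1)
    cxy = sum((x - mean_x) * (y - mean_y) for x, y in points) / (n - 1)
    return cxx, cxy, cyy


def diagonalize_2d(cxx, cxy, cyy):
    """Diagonalize a symmetric 2x2 matrix with one Jacobi rotation.

    Returns (theta, lam1, lam2, off): the rotation angle, the two diagonal
    entries after rotation (lam1 >= lam2), and the remaining off-diagonal
    entry, which should be zero up to rounding.
    """
    theta = 0.5 * math.atan2(2.0 * cxy, cxx - cyy)
    c, s = math.cos(theta), math.sin(theta)
    lam1 = c * c * cxx + 2.0 * c * s * cxy + s * s * cyy
    lam2 = s * s * cxx - 2.0 * c * s * cxy + c * c * cyy
    off = (cyy - cxx) * c * s + cxy * (c * c - s * s)
    return theta, lam1, lam2, off


# Hypothetical sample: points lying exactly on the line y = x.
POINTS = [(0.0, 0.0), (1.0, 1.0), (2.0, 2.0), (3.0, 3.0)]
THETA, LAM1, LAM2, OFF = diagonalize_2d(*covariance_2d(POINTS))
# The first principal axis is (cos THETA, sin THETA).
```

For data on the line y = x, the first principal axis points along the diagonal and all variance falls on that axis. For larger matrices the same rotation is applied repeatedly to pairs of coordinates.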
Yuandong Liu, Yanwei Li, Yanhui Guo and Miao Zhang,
"Stratify Mobile App Reviews: E-LDA Model Based on Hot Entity Discovery",
conference: SITIS 2016
Shiting Xu, Yuandong Liu, Yanhui Guo, Guoai Xu,
"Malicious Application Dynamic Detection In Real-Time API Analysis",
conference: Smart Data 2016
Mobile Network Security Authentication System
Is the app safe before it is uploaded to app stores?
Although China has 460 million active Android users, making it the world's largest Android population,
only 30% of them can access Google Play, so 70% of users discover their everyday Android apps through
third-party app stores. Android developers are free to distribute their apps however they choose,
including by publishing to third-party app stores. More than ten third-party app stores in China
are used by local people every day, so the trustworthiness of app stores in China is an open question.
This project inspects each APK file and repackages it with an authentic signature before it is uploaded to app stores.
The whole APK inspection procedure includes static analysis, dynamic analysis, and a malicious app search engine, among other components.
I am responsible for Android dynamic analysis. Based on the open source Xposed Framework,
I developed an Android system API monitor module. The module can record and modify the parameters of APIs
invoked by the running app. It can also notify users of the details of app behaviors, e.g., sending text messages, connecting to the internet,
recording audio, and invoking the camera.
The source code I am responsible for can be downloaded here
I also co-authored a Chinese paper, "A detection method of Android application malicious behaviors based on Xposed framework",
Sciencepaper Online12: 1264
Personal Tour Sites Planning System
When you are going to travel to places you are unfamiliar with, how can you plan your tour sites
to make the most of your time and money? Would you like to share your findings during your trip with other tourists who
haven't been there?
This project provides tourists with a tool to organize their trips according to their personal preferences, and a community
for communicating with people who are fond of the same tour sites. The system is designed with a three-layer web structure:
Business Logic Layer (BLL), Data Access Layer (DAL), and User Interface (UI). We built a B/S (browser/server) database application system
and implemented it with the ASP.NET framework in Visual Studio. We also developed an algorithm to optimize the route for users,
presenting the tour sites dynamically on the map by invoking the Baidu Map API.
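The route optimization algorithm itself is not described here. As a rough illustration of what such a step can look like, the sketch below uses the common nearest-neighbor heuristic as a stand-in, which is an assumption and not necessarily the project's method. The site names, planar coordinates, and function names are all hypothetical.

```python
import math


def nearest_neighbor_route(sites, start):
    """Greedy route: from the current site, always go to the closest unvisited one.

    `sites` maps a site name to planar (x, y) coordinates. Ties are broken
    alphabetically by name so the result is deterministic.
    """
    route = [start]
    remaining = set(sites) - {start}
    while remaining:
        here = sites[route[-1]]
        nxt = min(remaining, key=lambda name: (math.dist(here, sites[name]), name))
        route.append(nxt)
        remaining.remove(nxt)
    return route


def route_length(sites, route):
    """Total straight-line length of a route that visits sites in order."""
    return sum(math.dist(sites[a], sites[b]) for a, b in zip(route, route[1:]))


# Hypothetical sites on a flat map (arbitrary units, not real coordinates).
SITES = {
    "hotel": (0.0, 0.0),
    "museum": (1.0, 0.0),
    "park": (1.0, 1.0),
    "tower": (5.0, 1.0),
}
ROUTE = nearest_neighbor_route(SITES, "hotel")
```

The heuristic is fast and easy to explain to users, but it does not guarantee the shortest possible route. A real planner would also weigh opening hours, ticket prices, and travel times taken from the map service.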
How can we determine an effective, feasible, and cost-efficient water strategy to meet the projected water needs of a country in 2025?
At present, the shortage of fresh water resources seriously restricts the development of many countries,
so it is particularly important to develop rational and effective water resources strategies.
To meet water demand by 2025, this paper takes China as the research object,
analyzes its water resources data together with China's economic, ecological and environmental factors,
and establishes a strategic planning model for freshwater resources.
To address the problems of water storage and transportation, seawater desalination and water resources protection,
we took the maximization of social benefit S(X) and economic benefit J(X) as the objectives, with water storage, transportation,
desalination and water resources protection as the constraints, and set up a multi-objective programming model.
To solve the model, we turned the social benefit objective into a constraint, which transformed the problem into a single-objective
programming model. Solving this model yields the optimal allocation of water resources.
The water resources allocation model can be applied to cities, countries and even the whole planet,
so it has general applicability. The model is also applicable to the allocation of other resources,
such as oil, natural gas and minerals.
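The transformation described above can be written schematically as follows. S(X) and J(X) come from the text; the feasible set \Omega and the threshold S_0 are assumed notation introduced only for this sketch, since the original constraint equations are not given here.

```latex
% Multi-objective form. \Omega is assumed notation for the feasible set
% defined by the storage, transportation, desalination and protection
% constraints.
\[
  \max_{X \in \Omega} \; \bigl( S(X),\; J(X) \bigr)
\]

% Single-objective form after moving social benefit into the constraints.
% S_0 is an assumed minimum acceptable level of social benefit.
\[
  \max_{X \in \Omega} \; J(X)
  \quad \text{subject to} \quad S(X) \ge S_0
\]
```

Varying S_0 traces out different trade-offs between social and economic benefit, so the chosen threshold is itself a policy decision.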
Michelangelo, in creating David, said, “I just chip away the stone that doesn’t look like David.”
Looking at data, I sense that “David” lies inside - a pattern. Chipping away, the pattern appears from noise.
The future is hidden there, correlations waiting to be found, and more clues waiting to be unlocked through clustering.
The right model is as critical as an artist’s choice of chisel. Mathematics gives me a special perspective to see meaning in data.
To find connections between numbers representing the past, and what they mean for the future.
I believe that as a data scientist, I am a sculptor, an artist.
The ability to sense the pattern is the key to data analysis, which is why I have always been fond of the TED talk above.
Data is the carrier of information, and the world is about information. Whenever we are scared, it is because we have
no knowledge of what we are dealing with. Once we have data, things change a lot: data connects businesses in different fields,
and that is where the world is heading.
Passion in Data Science
During the national mathematical modeling contest, my team members and I performed analysis on chemical compounds
and physical observations for thousands of red wine samples. Provided with a sample of well and poorly rated wines,
we were tasked with predicting how others would perform. Observing the data,
I realized we would need to build a model to identify classes of wine from the rules that the ranked samples followed.
Spurred by my grasp of the task we faced, we discussed machine learning algorithms that would provide a solution.
We settled upon a Support Vector Machine model, a supervised learning algorithm. Using the training set to
calibrate our classifiers, we successfully separated the wine samples. I am Michelangelo.
It is my job to bring David from this stone, so that you can see him yourself. I am a data scientist.
Data Science Is More Than Science
The amazing thing about data science is that it connects the whole world: science, technology, art and business. This is
the world I want to explore, fueling the world with data.
My Way of Thinking
I believe that art and science are two sides of the same coin, both pursuing the deep, universal, eternal and meaningful things in life.
Artists and scientists share a sharpness of vision, seeing what other people cannot see.
In my spare time, I like to observe the world from different angles and capture what touches my mind and makes me think.
I want to leave
Beauty Quiet Deep Mystery
To seek the truth
One day my eyes are blurred
Still I can feel you with my body
Touch you with my hand
Getting closer to me
Then you know how beautiful I am
Do not push me away
Let me in
We are both young
We do not know
We leave each other behind
To find the perfect rose
One of common roses
You tamed me
You are my only rose
We are all busy
To what end
You never feel
Come to me
The day faded away
Tears or rain
To where my soul would be free
The air we breathe
The rain we taste
You and me