(Compiled by: Xiaobai)
About the author: Mr. Rudder is president and a co-founder of OkCupid and the author of the book “Dataclysm.” In this article he argues that collecting user data is one of the necessary means by which online service providers improve their services, but that data collection must be governed by standards that set a bottom line.
This summer I announced that OkCupid, a dating site I helped build, had run a test on a small number of users, and that the test showed our matching algorithm to be very effective. We have long used site data to run tests like this, and we pay attention to them because the results help us improve the user experience. The test was simple, and every user involved was told about it when it ended.
But when people learned what we had done, the reaction was quite different. In retrospect, it was the way we published the results that angered the public. We described the test on our own blog and explained the results to everyone, but we did not explain the purpose of the test. Collecting and analyzing user data is a sensitive and complicated matter. Our treatment of an important issue came across as off-putting, and many people even concluded that we do not take the private, emotional lives of our users seriously. That is a disturbing thought, even if it isn’t true. We can apologize to the public for the way we released the results, but the strong protests raised a bigger issue, one that neither OkCupid nor any other company has found a solution to.
Like other data experts, I worry that this debate, which dates to June, when Facebook said it had changed its algorithm, is missing a valuable opportunity. The internet has accumulated a wealth of information, and behind that information lies immeasurable social potential. Users provide data that helps websites improve and make a profit; that is a well-known fact. But the same data can also advance our understanding of society and lead to new science.
The OkCupid experiment tested how we assess compatibility, with the ultimate goal of making better matches. For any two users, we usually estimate the likelihood of a match from how similar their interests are. But in this test we replaced our estimate with a “placebo”: a random number assigned to a pair of users. In effect, we set aside the role of common interests in predicting attraction between two people (to allow for the possibility that “opposites attract,” or that common interests have nothing to do with attraction). All the other information users rely on to choose a partner, such as profiles, photos and personal statements, was left in place.
We found that two people with the same interests and hobbies really do get along better, but we also found that how similarity is defined has an important influence on the result. That is, people do not simply discover what they have in common on their own, even on a site as successful as ours; we still need to draw attention to it. We changed OkCupid’s interface accordingly, and we now place more emphasis on similarity scores.
At the same time, we added this finding about how common interests are defined to our psychological files, alongside the hundreds of other things we have learned about people over the past decade. Among those findings: women are twice as harsh in their judgments as men; beauty, in the eyes of its beholders, behaves like an exponential physical scale, such as the Richter scale; politics matters less in dating than you might think; and white people like to talk about their hair.
Our job is to help people come together. For that reason, we have done a great deal of research into how people come together.
Other sites have different purposes and therefore study other things. Taken together, we are all trying to understand human nature. Social networking sites are reshaping sociology: they have freed it from the traditional questionnaire and the laboratory and brought it into real life.
On the internet you have friends, lovers and enemies, and there are those few moments when you feel strongly that nobody knows what you are doing. That is only how it appears on the surface; computers, of course, are automatically recording all of it. Once collected and stripped of personal information, the recorded data, whether gathered through tests or provided directly by users, can tell us what our lives really look like.
OkCupid occasionally works with scholars outside the company to refine our findings, but some companies have brought that kind of partnership in house. Facebook has built itself a world-class team dedicated to analyzing the academic value of its data. In the past year alone, they tracked how we spread rumors and share memes through status updates; they published several research papers on relationships among friends (one surprising finding is that a decentralized network of friends can make a marriage more stable); and they tracked the decline of village populations in Southeast Asia as people began migrating en masse to urban centers.
Google has also invested in social research. Seth Stephens-Davidowitz, a social research scientist at Google, recently used search data to estimate the size of the gay population in American society. He showed how intolerance keeps people in the closet. He also gave us a picture of the emotional toll of repression: searches for “my husband is gay” are more common in states that ban gay marriage, and the same holds for the number of Craigslist posts seeking anonymous gay sex. As he put it, the United States holds a great many secrets, and they can be attributed directly to intolerance of gay people. The data we collect, he said, is what shows this.
What such studies can accomplish varies with the setting and with who is being studied, but what makes them unique is that they can uncover hidden stories, because the data reveal what we actually do, not just what we say or what we would like to do. Facebook in particular, because it is popular all over the world, lets its researchers study populations that other researchers cannot reach. I have seen how OkCupid’s messaging patterns betray our prejudice against black users, a prejudice that persists even among coastal, upscale, seemingly progressive audiences. At Google, they have found that Americans still search for racist jokes millions of times a year. All of this is worth understanding.
There are still many ethical issues to be resolved. I think the most direct communication between sites and their users needs to improve. How do we protect the privacy of personal information while keeping anonymized data meaningful? How can we make sure users are not put at any risk? What is permissible, and what should be banned?
Traditional science has been searching for these answers for many years. Data science has begun to set standards, but it still has to mature. I hope to see more cooperation with technology companies (and less criticism) from scientists and scholars; together we will try to bring established research methods into our new medium.
We are living in a world with less and less privacy. Technology has taken hold of important parts of our lives. That is a fact, and these networks are the joint creation of us, their makers, and their users.
We all know that technology companies have created great wealth, but many in the industry believe there is knowledge and value to be had as well. I am one of them, and I am eager to see that value realized: turning what we learn into new knowledge and lasting benefit.