Listen to past episodes, read about the hosts or donate to the show at podcastinit.com
Brief Introduction
- Date of recording – June 10th, 2015
- Hosts Tobias Macey and Chris Patti
- Follow us on iTunes, Stitcher or TuneIn
- Give us feedback! (iTunes, Twitter, email, Disqus comments)
- You can donate (if you want)!
- Overview – Interview with Eric Schles
Interview with Eric Schles
- Introductions
- How did you get introduced to Python?
- What inspired you to take up the fight against slavery? Is there a personal story behind this choice?
- Some of your work touches on the “Deep Web”. Can you provide listeners with some context around what that term means and the role it plays in what you do?
- Tor .onion sites (Hidden Services) are examples
- Anonymous Web Experience
- Anonymity allows for illegal, immoral things like buying and selling people
- Conceptually very important idea
- Bruce Schneier – Web technologies need to be more privacy aware
- Like a really scary version of “The Internet of the Old Days”
- Photos of young, exploited men and women
- Pedophiles are building communities, having parties through these hidden services
- Eric feels that Tor is an extreme
- Feels there has to be a way to protect the rights of legitimate users while protecting against pedophiles
- Maybe a voting system?
- The Tor project feels that any compromise lessens the anonymity that’s so important for people in embattled cities or countries (Worded that poorly -Chris)
- No metrics on the amount of pedophilia that actually happens on Tor – probably a lot
- Sexually abused victims of trafficking grow up damaged and unable to do anything else
- Consumers of this type of porn were often themselves victims of sexual abuse
- Structural dissonance which exists to create this problem in society needs to be addressed
- Google puts the number of the anti-trafficking hotline at the top of any trafficking search results
- Darren Hayes – redirect to trafficking resources when viewing advertisements for victims of trafficking
- Why did you choose Python as opposed to any other tool for your search engine?
- Needed solutions quickly with the ability to evolve as needed
- Able to rapidly develop and incorporate new features
- Easy to scale as needed
- Flask is easier to prototype and iterate with
- Python data science tools make the analysis easy
- Able to finish a 2 year C++ project in 3 weeks using Python
- Doing data science in Ruby is challenging
- Pandas Dataframe galvanized the creation of a lot of other useful tools
- Vincent – write Python which compiles to D3
- Can you provide a high level description of the technical details of the search engine that you created, and what it’s like to work with Tor through Python?
- Directed search engine
- “It would be like if you went to Google but everything you watched was porn which you were uncomfortable seeing and made you sad”
- Get most case information through regular old detective work
- Person arrested / in holding yields phone number, other attributes that can feed the search engine
- Google can’t scrape the deep web
- Memex tool indexes the deep web – Eric’s search engine uses that
- Eric does design work for the Memex project
- Developed by the amazing Chris White
- Eric’s search engine uses the Tor driver in Selenium to scrape .onion sites
- What are some of the technical and legal challenges that you experienced in the course of your work?
- Most of the technical challenges are around automated processing
- Legal structure provides some limits on what can be worked on
- Does your search engine try to infer who might be engaged in sex work voluntarily as opposed to those being forced into it against their will?
- No, because they get all their case referrals from detective work
- You have to have been hospitalized or in some other way come to the attention of the authorities for being deprived of rights
- Trafficking looks very different in different cultures
- Global similarities
- Afraid to say why if hurt
- Forced into having sex against your will
- Clear patterns of indication
- Urban versus Suburban versus Rural
- Fracking towns
- Demographics are very different – mostly men, very few women, LOTS of ads for sex workers
- Only helping people that want to be helped
- What was the most surprising fact you uncovered as part of your research?
- Imagery of exploited children is so depressing and sad
- Without revealing anything you shouldn’t, are you aware of anyone being set free as a result of your work?
- “Not my work, our work”
- Not an individual effort
- lawyers, analysts, larger DAs office
- Given the complicated socio-economic aspects of human trafficking and the prosecution of those who are responsible, can you discuss some of the moral and ethical considerations that you have been confronted with while building these tools?
- Privacy is the biggest concern
- Open source book to teach colleagues at the DA’s office how to program in Python
- Sometimes Eric works at Civic Hall
- Are there any projects out there that you consider similar to what you are working on?
- Thorn’s Spotlight tool
- Memex Project
- Polaris Project
- Datakind Anti Trafficking
- dosomething.org – more broadly focused – help center for teens
- RescueForensics – stage startup
- What would it take for other municipalities and law enforcement agencies to get started with using your tools?
- Go to https://github.com/EricSchles
- Alert System and investa_gator
- Contact Eric at ericschles@gmail.com to collaborate
- How can our listeners get involved and help you with this?
- Tweet at @EricSchles or E-mail Eric
- Volunteer for any of the non profit anti-trafficking groups
- Message to the community: There is a world of good waiting to happen
Picks
- Tobias
- Chris
- Eric
Keep in Touch
- Twitter: @EricSchles
- Eric’s About.me page
More From Eric
The intro and outro music is from Requiem for a Fish by The Freak Fandango Orchestra / CC BY-SA
Hello and welcome to Podcast.__init__, the podcast about Python and the people who make it great. We're recording today on June 10, 2015. Your hosts, as usual, are Tobias Macey and Chris Patti. You can follow us on iTunes, Stitcher or TuneIn Radio. And please give us feedback. You can leave a review on iTunes or Stitcher, contact us on Twitter, send us an email at hosts@podcastinit.com, or leave a comment in our show notes. And if you'd like to support the show, you can find some donation buttons on the site at podcastinit.com. Today we're interviewing Eric Schles. Eric, could you please introduce yourself?
[00:00:52] Unknown:
Hello everyone. I'm Eric Schles. I'm a data scientist at the Manhattan District Attorney's Office, where I work on preventing and combating human trafficking.
[00:01:03] Unknown:
How did you get introduced to Python?
[00:01:05] Unknown:
Actually, it was a few different ways, but the first real sort of jump towards that was, I took the MIT introduction to Python course in preparation for a master's in computer science at NYU. And the program there focused heavily on Python, the master's did. And so yeah, I became somewhat of an aficionado, and eventually sort of moved towards mastery. But I also did the Khan Academy lectures as well as Codecademy, and those were both wonderful resources for starting me on my journey. Python is wonderful. I love the language. And I get to play with a lot of things because there's so many packages sort of implemented for you. Right? So, yeah, it's been very useful and helpful for the work that I do because people have done the hard work for me a lot of the time.
So that's why I sort of stayed with Python over the course of my career. Now I use it at my day job, which is wonderful.
[00:02:07] Unknown:
Yeah. That's been a very common thread with everyone we've spoken to saying that one of the things that they really love about Python is the vast ecosystem of packages and libraries that they can draw on to do their daily work rather than having to start everything over from scratch.
[00:02:23] Unknown:
Oh, yeah. Absolutely. Like, for instance, having access to deep belief networks and, like, neural nets just, like, for free, basically, is super powerful. I also make use of the requests library a ton. I think I use that probably more than anything else, just because a lot of the things I'm doing is pulling down data from, like, open web sources. But then also, you know, you have all the native stuff. Right? So, like, the string processing is fantastic. The native data structures are extremely intuitive, and then sort of the additional ones. Right? So, for instance, the Pandas DataFrame is so helpful for transformations and data cleaning.
It's so much better than R, which is what I was using before for those two purposes. So, yeah, I haven't found a real reason to use anything else when it comes to flexibility or moderate performance.
[00:03:22] Unknown:
That's interesting that you say that you're using it in the stead of R and haven't found anything for your particular needs where R was actually superior. So that's interesting to hear.
[00:03:31] Unknown:
Not for data cleaning. In terms of model selection, there's a bit in R that I don't find yet in Python. But in terms of data cleaning and data processing, I have not found the same utility with R.
[00:03:47] Unknown:
So what inspired you to take up the fight against slavery? Is there a personal story behind that choice?
[00:03:53] Unknown:
Absolutely. So we have to go back in time quite a bit, actually. So hearken back to the summer of 1995. I was 12 at the time. And my family, my parents are a bit strange. They sort of did the globetrotting thing when they were post college, and both found themselves in India in a rather remote village just for various reasons, mostly because they're nuts. And so we went back there to see where they had met. My parents wanted me to see where they had met, I guess, for strange reasons because it's a horrible little village. But I saw a man in the street with a metal collar around his neck, and he was begging for food and change. And my father later explained to me that he was actually sort of not doing this for himself. He was doing this because what happens is, occasionally in small villages, parents have huge families or people have huge families. They can't support their kids. Their farms dry up, especially when you're in the middle of the desert.
And they end up putting their kids into slave labor, either begging or, you know, they sell them. It's really bad. And so I found out about this when I was 12, which was perfect timing because I had just learned about slavery in America and, you know, I was taking American history or something. And we learned how Lincoln had freed all the slaves, and I was sort of confronted with this moment of, oh, no. It still happens everywhere else. America got fixed, sort of, but not everywhere else. And so something inside me just sort of, like, broke. And then I spent, you know, a couple years sort of being depressed and trying to think about it as much as I could, and then much of my late teens and early twenties thinking about how to stop slavery internationally.
And it led me to economics where I learned a whole bunch about how one could understand systems and sort of implement policy solutions to this. But, you know, I needed to get my hands dirty with data. I needed to be able to build tools to be really effective. And, also, the developer community in Python is fantastic, and it's helped me build up this huge network of people who are looking and willing to help. So that's been great. Yeah. And then I guess, after that, I did sort of master's work and then computer science stuff. And so that sort of, like, brought me to where I am today. But it's interesting. As I started learning more about slavery, it sort of, like, also came on the rise in America, because the Internet's ease of use brought actually a lot more demand in America and allowed sort of traffickers here to communicate and to reap supernormal profits, which has been not great. So technology has been both a blessing and a curse to America. It wasn't actually as bad a problem, or at least it wasn't as well documented a bad problem, in the nineties in America. I was interested in an international focus, but there's sort of been this rise locally, which is why I bring it up. And so using technology, hopefully, we can combat it in this country, and then I'll find more standard sort of policy solutions abroad because everyone's on the same policy page here in America.
[00:07:06] Unknown:
So some of your work touches on the deep web. Can you provide our listeners with some context around what that term means and the role it plays in what you do?
[00:07:14] Unknown:
Sure. Sure. So the deep web refers to, I'm being a bit vague here, but it relates to websites that require the Tor network among other networks. It's basically not the normal Internet. So the Silk Road, for instance, is a great example of a deep web website. You have to access it through the Tor network. You know, .onion sites are all examples of deep web things. They're not easily scraped. They don't live at static IP addresses. Unless you know how to get to them, you can't always get to them. You can also, by the way, search, you know, normal websites like google.com or something through the Tor network, and it's also anonymous, which is important. So when you visit a website through the Tor browser, the Tor browser essentially obfuscates your identity completely.
And so you have this, like, anonymous web experience where no one knows who anyone else is, and that's sort of the point. So there are a few caveats there. Some people have been able to map some pieces of this through a lot of statistics, but you still can't get at 100% certainty who anyone is on the deep web. So how this plays into my work is not going to take a huge leap of intellectual thought. Basically, because everyone's anonymous, you can do things like post interest in, essentially, buying and selling people. So this happens on some deep web sites.
Standard web scrapers typically cannot handle this, although there are some tools in Python, among other languages, that do. So you can make use of Selenium and then mechanize the Tor browser. There's a Tor driver, and you can use that to scrape the content on the deep web. But, you know, if you use the requests library and try to just do a GET against some deep web website, you will not get anything back. Actually, I've never tried it, so I'm not 100% sure that that will happen. You might get something back, but I'm reasonably sure you won't. So a standard HTTP request won't work. I don't wanna go into the internals of why or how this works though, just because, you know, then it's a very long discussion. That's a whole show unto itself. Yeah. Yeah. That's right. That's a whole show unto itself.
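As a rough illustration of the point about plain HTTP requests: a .onion hostname does not resolve through public DNS, so an ordinary GET has nowhere to go unless the traffic is routed through a Tor client. The sketch below is not the Selenium and Tor driver setup Eric describes; it is a simpler, swapped-in approach that only decides when a request needs a Tor SOCKS proxy. The proxy address assumes a local Tor daemon on its default port 9050, and the helper names `is_onion` and `proxies_for` are hypothetical.

```python
from urllib.parse import urlsplit

# Assumption: a local Tor client is listening on its default SOCKS port, 9050
# (the Tor Browser bundle listens on 9150 instead).
TOR_SOCKS_PROXY = "socks5h://127.0.0.1:9050"


def is_onion(url: str) -> bool:
    """Return True if the URL's hostname is a Tor hidden service (.onion)."""
    host = urlsplit(url).hostname or ""
    return host.endswith(".onion")


def proxies_for(url: str) -> dict:
    """Build a requests-style proxies mapping for the given URL.

    .onion addresses get routed through the Tor SOCKS proxy; everything
    else gets an empty mapping, meaning a direct connection.
    """
    if is_onion(url):
        return {"http": TOR_SOCKS_PROXY, "https": TOR_SOCKS_PROXY}
    return {}
```

With requests installed with SOCKS support (`pip install requests[socks]`), the mapping can be passed as `requests.get(url, proxies=proxies_for(url))`. The `socks5h` scheme asks the proxy to resolve the hostname, which is what a .onion address needs.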
But, yeah, the deep web is interesting. I actually am for privacy-aware web browsing. I think that if people are honest and don't use it for bad things, then the deep web is not necessarily a bad idea. I think, conceptually, it's very important, and I think you can build privacy-aware systems that still give information that's pertinent. So I think also the web needs a bit of a redesign, if I'm being critical. And, you know, people like Bruce Schneier have sort of talked about this. He does a wonderful rant. I got to see him in person recently, actually, which was really cool. And he does a wonderful rant about how we need to be making our web technologies more privacy aware and sort of stopping this aggregation thing. You can actually build systems that are still safe, that still allow us to get the information when it's necessary, but don't require overkill like the Tor experience. But I think privacy is very important, and we should all cherish it. It's just unfortunate when some bad actors really, you know, sort of mess things up for the rest of us, but that's sort of where we're at. Because of this technology, we have to sort of, you know, go a little deeper into this because of bad actors.
So to sum up, you can sort of think of the Deep Web as the Internet of the old days, but also a really scary version of that where pictures of young women and young men, like 5 or 6 year olds, are posted. And, you know, these are women, young men, or children, I should say, children that are exploited, and there's no way around it. And the deep web has allowed that practice to propagate and allowed people who are deviant to become increasingly more deviant because they can congregate now. They can have conversations. They can learn from one another. They build communities. They have parties. I can't say anything further about how I know that, but I can say that I know they have parties.
And it's terrible. So that's what the deep web has unfortunately afforded us now.
[00:11:29] Unknown:
You know, you don't even have to be a researcher to, I mean, I didn't know about the parties per se, but I'm actually very interested in Tor because I feel like, as you pointed out, privacy is important. I mean, even not necessarily so much here in the United States, but I feel like it's very important for someone in China or Iran or whatever to be able to get access to a relatively free and unrestricted Internet just like we have. And I feel like Tor is a great tool to enable that. But anybody who has set up Tor in the Tor browser bundle or whatever and gone poking around to any really significant extent, you know, you can't walk 10 steps, metaphorically speaking, without stubbing your toe on a pedophilia site. I mean, it's rampant in Tor, in Tor hidden service land, which I think is a really unfortunate thing. And so that's why I think projects like yours are so incredibly important because we need to not just say, well, Tor is for pedos. I think we need to actually work to make sure that it's a tool that can be used effectively by the people who really need it. Right?
[00:12:38] Unknown:
Right. Absolutely. Absolutely. I think you're 100% right. I would actually argue for something. So my fear is that you can't stop deviance. Right? Like, sort of, like, if you live in a society where all ideas are free, the bad ones are gonna get out too. And I'm not saying that we shouldn't have that. I think we absolutely need to have all ideas expressed unless they're, you know, something that actively damages others or, in other ways, causes them to have a loss of freedom. But right. So the point I'm trying to drive at is that because of these services, I think that Tor takes it to too far an extreme.
That being said, of course, if you're in a place like China, then or, you know, some places in Africa or the Middle East where or or even some places in South America. If you voice your opinions, you get killed. So for instance, I can't remember what city it was in, but there was a city that's using Twitter right now to get all of their news and updates and things. And there are these women that are going around and and young men who are going around and sort of providing insights and news because the news organizations have collapsed. And for speaking out about where the terror is happening, where the drug cartels are, where the where where the human trafficking is, where the the drug trafficking is, they're getting killed and their pictures are being put up on websites saying, you know, this is what happens to snitches.
And so for that kind of thing, something like Tor is necessary. I don't think we have the right system yet, though. I I think we're close, but there has to be a way to sort of protect the freedoms of the many from getting killed if they're doing something legitimate, but also allowing or disallowing folks like pedophiles to run rampant. And maybe that's, you know, you have some some voting of the entire population. Like, the entire population on Tor says, no, this website is bad. It gets shut down that day. I mean, maybe democracy is not the right answer, but, you know, I feel like there has to be a mechanism.
Because right now, it's gotta come from the people. I mean, policing is going to happen, and we're not gonna stop. Right? Predictive analytics and data science is being integrated into law enforcement more and more, which I am really proud of. I think we need more understanding of technology, not less. But at the end of the day, I think there needs to be some sort of protocols in place to stop it on a mass level and to be self policing. Because right now, it's not, and it's that simple.
[00:15:26] Unknown:
No. Absolutely. I mean, the Tor project feels very strongly that your assertion is wrong, essentially. Right? And that in providing the tools to allow people in those crazy distressed places like, you know, like you said, these embattled cities or Syria or Iran or China who are under serious duress, and as you mentioned, could die literally if the fact that they were publishing these things gets out, they feel like absolute anonymity is the only thing that works, is the only thing that can help. And I'm pretty sure it would be really interesting to get a response from Tor to your assertions.
But I'm guessing having, you know, having run a relay for a while myself and having read a bit of of literature and, you know, the mailing lists and such, their response would be there is no way to provide that kind of a mechanism without lessening the effectiveness of the anonymity that Tor provides. And it's really interesting. You guys are kind of at I feel like I'm listening to a debate between Richard Stallman and, you know, I'm not sure someone from the from the proprietary software community here, but it's it's it's a really, interesting polar situation.
[00:16:46] Unknown:
Well, okay. I am absolutely not from the proprietary community. I believe in open sourcing everything. I think no code should be closed source.
[00:16:54] Unknown:
I totally understand, and I apologize. I was using an analogy. I was just saying that your stance is a viewpoint at one end of the spectrum. Tor seems to be at kind of the other end, or maybe you're towards the middle, actually. So you're not saying Tor is bad, you're saying Tor is good, it's just imperfect and we need ways to stop it from being this den of pedophilia, which in a way, it currently is because as I say, even as a casual explorer of the deep web, you kinda almost can't help but bump into this stuff. And I honestly found there are other people who feel the same way. Right? Like there's the, what do they call it? The hidden wiki or something like that. And there's now a censored version, basically so you can find good sites on Tor that aren't pedophilia oriented if you are interested in anonymity and privacy and things like that. So you're not the only one who feels this way. And I actually think it's at least worthy of some research to see is there a way that we can build some kind of safeguards into the system to make what you're saying a possibility.
[00:18:03] Unknown:
Right. And I think that would be all I would ever argue for. Now I don't want to, you know, get into the most ardent of hardcore debates about this because at the end of the day, I'm not a novice when it comes to the deep web, but there are people more capable of having this debate than I am, if that makes sense. I'm not, by any stretch of the imagination, a master of network things. I am very, very good at data science, but that's where my skill set lies. But, look, there is this problem. Right? There is this problem with the deep web, and it affects real people. Honestly, not that it would be impossible, and it absolutely happens on the regular web. Right? So I think making the deep web less anonymity-centric will probably not solve the problem. But if it means that, I don't know, a thousand kids, ten thousand kids, like, where do you draw that line?
Where their their their lives are better because you can't profit off of, you know, or maybe it makes someone less likely to even try to profit off of them as sexual objects. And it honestly depends. It's a hard debate to have because you don't know how much of this stuff is recycled. I don't know of any metrics on the amount of actual pedophilia that actually happens. And, you know, if you took a if you took a statistical study and actually measured it, could you even figure out how much legitimate pedophilia there is going on? I don't know the answer to any of these questions, but my guess is it's probably a lot. Right? Because you operate with essentially no real safeguards, and you operate in such a way that there's no real unless you're really stupid, there's no way to get caught. And so I just I worry about these things.
And I say this having, I mean, in my line of work, met people who were victims of child sexual abuse and child sex trafficking. And they don't grow up to be good people. I mean, not by any fault of their own. They end up becoming trafficked for a lot of their life. These women start out this way, sometimes these men start out this way, where they're sexually abused, and they end up believing that sex is the only way that they can make money because they didn't have a good upbringing. And so, you know, if you can't curtail the child pornography and the pedophilia industry in a very real way, you can't stop sex trafficking because it's a culture thing.
Yeah. Sorry. It's getting too depressing.
[00:20:52] Unknown:
No. No. It's not at all. I mean, I think I think that's why we had you on the show. And moreover, I think that you should feel good about the fact that so let's sort of say, like you said, I totally agree. It doesn't make much sense perhaps to get into a hardcore debate over is Tor the right answer or not or should Tor change. Because what we're here to talk to you about is the awesome fact that you have made this incredible tool set for the forces of good. Right? And you are actually policing the Deep Web to try to help root out these incidences of human trafficking.
And I think anybody with the slightest shred of morality in them can agree that that is absolutely a really good thing. And as I say, you should feel incredible about it.
[00:21:43] Unknown:
Oh, sure. No. Most days I I get to go to sleep saying, did I did I regret anything today? And the answer is usually no. It's just it's hard day after day to look at the the problem domain and not feel a little bit worn down just because you're seeing so much of the worst tragedies in humanity, I think, that we've ever seen because it's structural and baked in to systems that you believe are there to help people in a sense. Right? Because technology is supposed to make our lives better, and we're seeing it make our lives worse. And the the consumers of the of the trafficking industry, these are not people that are poor. These are usually very wealthy individuals.
The sex trafficking industry is doing really well. It's a multibillion dollar industry. And so you have to ask yourself, if it's a multibillion dollar industry, who are the people that are paying all that money into it? And it's, unfortunately, wealthy individuals. Maybe you would think that the people with with power and money and then, obviously, it's nowhere near everyone. Right? But there is a subclass of wealthy individuals who are consumers of this this product of of human suffering, more or less. And so there's some level of, sort of, like, structural dissonance that's going on in our society, and it's something we need to address.
And that's what I'm trying to do, honestly. I'm trying to address it. And I should note also that, usually, the consumers of the sex industry were in some way sexually abused themselves at a young age. It's not everyone, but there is a subset of this population. There have been psychological studies on this. I can't think of any of them off the top of my head, but they definitely exist. I know, anecdotally, sex therapists who have worked with people who are recovering from sexual addiction, and they went through sort of similar things. And these are guys who worked in finance, who had very high stress jobs, but, you know, were legitimately brilliant, but couldn't see how much they were hurting someone else. So I think there is some structural dissonance that allows this sort of system to persist.
And we're only starting to address it, which is the really depressing part. But, you know, hopefully, we can do something meaningful.
[00:24:19] Unknown:
Absolutely. And just to quickly address your point about technology that's meant to do good being used for something bad. I mean, that's been the case ever since the cavemen picked up a stick and said, hey, or, you know, I can build something with this or, like, and then use it to club, you know, their fellow cavemen over the head. So it's like and since technology has existed, it has always been that dichotomy of, do we use this for good or do we use this for evil? Right? Like, I mean, just about every single piece of technology ever can be used for either.
[00:24:53] Unknown:
Oh, absolutely. I'm not so naive that I believe that technology will always be used for good, but it's about, and I should clarify this, so I appreciate you, like, sort of bringing that up. It's about the level at which technology is being used for bad as opposed to good. Right? So, like, someone picks up a stick and hits his fellow caveman, I don't know if we should have an outcry. Someone uses technology to deprive someone of the ability to have a life, I feel like that's taking technology to a place where I don't know if there's a way back from unless we stop it now.
[00:25:35] Unknown:
Yeah. Another anomaly of our modern age is the scale at which our technology allows an individual to operate, whether for good or for bad. So taking it back to the stick analogy, you know, yes, somebody can pick up a stick and use it to assault their fellow caveman or fellow human or whatever. But that's potentially an isolated incident, and the potential magnitude for that level of harm is inherently limited. Whereas with our current technologies, whether it be the Internet or, you know, various forms of software or even our nuclear arsenals, our capacity for harm is drastically increased. And we are increasingly isolated from the effects thereof, which really just makes it easier to dehumanize the victims of that harm. So because you are one or several steps removed from the actual suffering, it's not as visceral for you.
[00:26:43] Unknown:
Absolutely. So that's the dissonance that I'm referring to, because I was vague, and I realize that now. It's not like with a car. So when motor vehicles and mechanization happened, we were still close to the things that we were mechanizing. And so it was clear, if you hit someone with a car, they die. With computers, we've created this layer of abstraction from the results of the technological innovation that has happened around us. And so there's this dissonance of reward and risk and also of cause and effect. And I think that that's something that we need to bake into our systems a little bit better. And there are people that are trying to do this. I can speak to two such projects.
So actually, three. Google has been actually wonderful about this. And I have some friends at google.org who work on anti-trafficking things, and they're wonderful. They do great stuff. One of the things they did was they put the anti-trafficking hotline at the top of any search related to trafficking. So that means if you're looking for it, you know, it's sort of like a reminder. It's like there are victims here, and I think that's yielded some results. But it's sort of like, okay, so, you know, these are eyeballs. We're trying to bake that in. Another example of this is some work being done by professor Darren Hayes over at Pace University. He's a good friend, and he is working on the mobile space. So what he's doing is creating fake profiles on places like Tinder and, I believe, OkCupid and other places like this. Racy young women. And if you sort of click on their profile, it'll send you to trafficking awareness things.
I also am aware of there's a group up in Boston that I believe does this as well called Hunt Alternatives, and they they work specifically in, like, advertising space. So they'll put up an ad for a girl. You click on the website for you know, if you wanna buy, and it'll tell you, like, statistics about what happens with trafficking. And so, you know, it's sort of like trying to bring awareness to the people that are engaging in this practice by confronting them where they where where they live almost, which is I don't know if it's the best way to do it, but it's something. And I think it's an evolving conversation that will continue to hopefully yield positive results.
[00:29:05] Unknown:
So redirecting our conversation a little bit, why did you choose Python as opposed to any other tool for writing your search engine?
[00:29:12] Unknown:
So I chose Python for a number of reasons. The most important one was we needed solutions quickly. Right? So this problem is happening now. We needed a tool that could sort of handle evolving needs from the people that I work with at the Manhattan District Attorney's Office, as well as something that would be supportable, readable, easy to use, and robust. So Python sort of fit the bill really, really well. I was able to not only sort of do rapid development around each of the tools, I was able to incorporate new features quickly because Python is so flexible.
I was able to, you know, take in modules as needed and start using them right away, and also maintain the code base fairly easily just because Python is somewhat self-documenting. Right? It's basically pseudocode. It's so much easier to read than C++. And, also, with the right sort of resources, computationally, it was easy to scale up as needed, which was important. Also, I made use of the Flask framework heavily. Web development is probably easier in Flask, I think, than in many other sort of web technologies. Right? So there are sort of notable things here, right, like Ruby on Rails and Sinatra and, you know, the MEAN stack. But I think Flask is easier to sort of template out, prototype, and then check to see if you are doing things correctly.
And so, armed with basically Flask, so I could create a really simple web interface for my analyst to use, and all the data science and machine learning packages and all of the wonderful high-level wrapped web scraping technologies that Python comes with, I was able to build tools robustly and quickly. So just to delve a little bit into what I was able to come away with in terms of tools: to date, I've built around 30 tools. At the Open Data Science Conference, I got to display 3 of them. Well, sort of display 2 of them and talk about a 3rd.
So you saw a very small subsection of what I've been able to build while at the district attorney's office, because Python is so fast and flexible and easy to develop in. One of my friends says Python optimizes developer time, and I think that's absolutely true. An example of this would be the first day that I started on the job. They sort of planned out what they thought would be the next 2 years of my life. Someone had been working on this tool in C++ for about 2 years, hadn't finished it, and actually had retired. I was able to use existing modules, finish the tool, and put it into production about 3 weeks after my start date. And I don't think it's because I'm a better developer. This is a senior guy. You know, he had years of experience, but he'd been working in C++. And so the tool never got finished.
And I was doing really standard things. It just pulled information out of a PDF. That was all they needed it to do, and it just turned it into an Excel document. And it took him years to do this because it's not as flexible. You can't sort of stand on the shoulders of giants. And I think that's what Python allows you to do.
[00:32:18] Unknown:
That's all great information and a great story of how somebody can use Python to explore a problem domain and come away with something useful in a pretty short amount of time versus maybe some other older or more traditional programming languages.
[00:32:34] Unknown:
Well, but even also, you know, what you mentioned with regards to Ruby is, for instance, the data science tools that we have in Python just really don't exist over there. I mean, sure, there are certain things, but there is not nearly the rich palette of options that the Python community has. And I'm not saying that, like, in terms of, well, Python's better. But in this case, for this particular problem domain, it is provably better and easier, and you've proven that with your explanation. Yeah, sure. I would say that doing data science in Ruby is challenging. I think people can do it, but it's definitely not as easy. There is an extremely robust community,
[00:33:12] Unknown:
and I think this is mostly because of the data frame that Wes McKinney built. I think having access to that sort of galvanized people around this and said, oh, you can do data science in Python, and sort of made it a reality. Yeah. I actually found his talk really wonderful at the Open Data Science Conference. If you guys didn't see it, you totally should. It was great. But yeah. So I think that data science just is easier. You know, you have scikit-learn. You have a really robust data frame. You have traditional modeling through statsmodels. You have really great and easy visualization tools. So matplotlib as well as Vincent, which lets you compile basically to D3.
It's so easy to build really robust visualizations. So you sort of have this full stack data science toolkit at very low cost. With, like, about 5 or 7 lines of code, you can build a visualization for a social network. I don't know of a lot of other programming languages that allow that to be done that easily. I mean, I think this may exist in R, but it's far more difficult. Like, I have a colleague that works in R at work, and he hasn't figured out how to do sort of this yet. So I'm trying to convince him that he should just use Vincent, and then it'll be okay. But, yeah, I would say that doing data science in Python is easy and fun as long as you like math a lot.
[00:34:39] Unknown:
So can you provide a high level description of the technical details of the search engine that you created? And if you don't mind me inserting my own personal spin here, maybe give us a little bit of a sense of what it's like to interact with Tor through Python.
[00:34:54] Unknown:
So, basically, what's happening is, it's a directed search engine. Right? I'm not scraping the entire Internet because that would be silly. I'm not Google. I don't have those resources. There's no reason for me to do so. But what I am doing is I'm starting at a seed the same way Google does, except my seed is gross websites, like the grossest websites you could possibly imagine. Like, it's like if you went to Google except everything was, like, porn that you felt uncomfortable watching and were saddened by seeing, that would be my seed.
And then what I do is I traverse all of the links. So, basically, there are a bunch of links on web pages. Right? And you can traverse these. So I use recursion to traverse these websites, and I go, like, n levels deep, where n varies depending on the website and what I'm actually searching for. So it's also a directed search engine. So it's a little bit more manual than Google. Right? So with Google you query for a thing, and then it gives you, like, the 10,000,000,000 pages in, like, a second. What I'm doing is I'm querying for specific information. So I'm querying for phone numbers, I'm querying for email addresses, and I'm also querying for text that I received or an analyst received for a case. So the way that we get specific case information is through honest detective work. People will sort of say, hey. We got this new case. It looks like trafficking.
You should take a look. And so, basically, what we do is we then start a workup, and my search engine goes to work. It looks for any more links. It looks for further instances of this name, this phone number, this email address, or similar sort of writing to an ad if there's an advertisement associated. If not, then it'll just go off of phone numbers, because when someone's arrested, we get their phone information. So we have a way of contacting them if we don't send them to jail. And if we do put them in holding, then we still get their phone numbers, typically. And so these are hard attributes that you can then use to search the web and find more information about this person. If we find advertisements linking to Backpage, if we find other sort of prostitution-y sort of websites that are connected in some way, in a hard attribute way, and by hard, I mean, you know, a clear causal link between this person and this website, then we think it's trafficking and we start our investigation in earnest.
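The traversal described here, starting from a seed, following links recursively to a fixed depth, and pulling out "hard attributes" such as phone numbers and email addresses, can be sketched in a few lines of Python. This is a minimal illustration only, not the tool used at the district attorney's office: the `PAGES` dictionary, the `fetch` stand-in, the function names, and the regular expressions are all assumptions made so the example runs without network access.

```python
import re
from html.parser import HTMLParser
from urllib.parse import urljoin

# Hypothetical in-memory "web" standing in for real HTTP requests.
PAGES = {
    "http://seed.example/": '<a href="/ad1">ad</a> <a href="http://other.example/">x</a>',
    "http://seed.example/ad1": 'Call (212) 555-0147 or jane@example.com <a href="/">home</a>',
    "http://other.example/": 'Nothing here. <a href="/deep">deeper</a>',
    "http://other.example/deep": "Text 212-555-0147 now",
}

# Illustrative patterns for US-style phone numbers and simple email addresses.
PHONE_RE = re.compile(r"\(?\b\d{3}\)?[\s.-]?\d{3}[\s.-]?\d{4}\b")
EMAIL_RE = re.compile(r"[\w.+-]+@[\w-]+(?:\.[\w-]+)+")


class LinkAndTextParser(HTMLParser):
    """Collects href values of <a> tags and the visible text of a page."""

    def __init__(self):
        super().__init__()
        self.links = []
        self.text_parts = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)

    def handle_data(self, data):
        self.text_parts.append(data)


def fetch(url):
    """Stand-in for a real fetch; returns None for unknown pages."""
    return PAGES.get(url)


def crawl(url, depth, hits, seen):
    """Depth-limited recursive traversal that records phones and emails per page."""
    if depth < 0 or url in seen:
        return
    seen.add(url)
    html = fetch(url)
    if html is None:
        return
    parser = LinkAndTextParser()
    parser.feed(html)
    parser.close()
    text = " ".join(parser.text_parts)
    # Normalize phone numbers to bare digits so different formats compare equal.
    phones = {re.sub(r"\D", "", m) for m in PHONE_RE.findall(text)}
    emails = {m.lower() for m in EMAIL_RE.findall(text)}
    if phones or emails:
        hits[url] = {"phones": phones, "emails": emails}
    for href in parser.links:
        crawl(urljoin(url, href), depth - 1, hits, seen)


def pages_mentioning(phone, hits):
    """Return the crawled pages whose text contains the given phone number."""
    digits = re.sub(r"\D", "", phone)
    return sorted(url for url, found in hits.items() if digits in found["phones"])


hits = {}
crawl("http://seed.example/", 2, hits, set())
print(pages_mentioning("(212) 555-0147", hits))
```

The depth argument plays the role of the "n levels deep" setting. One caveat of this simple version is that a page first reached with little depth remaining is marked as seen and is not revisited later with more depth, so a real crawler would likely track the remaining depth per page.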
Yeah. So right now, we have, like, 30 ongoing investigations, and I think 9 or 10 active indictments, most of which did not come through this tool. I've only been there 8 months. It takes a really long time to bring an indictment, but I've been able to aid in a few of the cases since I've started and, you know, bring in more information faster. So the faster you can get information downloaded, which is essentially what I'm doing. Because right now, for this, people were just using Google or whatever. So I'll talk about that in a second and how this relates to the deep web and why my tool's helping a little bit there. Basically, I've been able to take this thing that was manual and took maybe weeks and turn it into days or hours or even minutes. So if you can do this, then you can work more cases at once. Your analyst is less overloaded, and then you save time and money, which is great for the government and also allows us to work more cases at once. Because we're a small team. Right? There's, like, 7 full time staff members in the human trafficking unit. Maybe I'm missing someone. Let me think.
5. No. There's, like, 9 or 10. And then there's, like, sort of part time people that help out from other units, but they're like, this is really depressing. We need more people on this, and so they sort of pitch in, which is really wonderful. So right. So talking about Google. So, obviously, Google doesn't scrape everything. It doesn't scrape the deep web. And so being able to scrape the deep web is really important. I do wanna say a lot of our deep web scraping right now comes from the Memex tool, which was developed by DARPA. They index a lot of the deep web for us, specifically around sex trafficking.
I'm working on solutions that are not big enough for them to spend their time on, but are not small enough that it can just be done by a person manually. So there's still some need for some automation. So the DARPA Memex project is being sort of piloted out to a bunch of folks around the country. Obviously, DARPA is a national organization. For those of you who don't know what DARPA is, it's the Defense Advanced Research Programs Agency. Or yeah. That sounds right. And they do a lot of wonderful, amazing work, and they've given us this Memex tool, among other places. And so that is the primary way that deep web scraping happens. But like I said before, what my tool does is I just use Selenium. I know what .onion sites I want. I sort of say scrape this, and then it follows, you know, the links the way the Tor browser would, using the Tor driver that comes with Selenium, and scrapes all the information down. It's actually pretty easy to do that once you know where you wanna point your Tor browser to get at information. But obviously, you know, you can't sort of do this over a course of days. Right? You have to do this, like, okay, I've got, like, a little bit of time to scrape all the information that I want.
[00:39:59] Unknown:
So does Memex, for you, does it work like Google essentially, where you can query it and get results back?
[00:40:08] Unknown:
So you put in specific web addresses and then I scrape them and continue to scrape them for so I'm typically scraping not the deep web. Right? So Memex is sort of focused on the deep web things a little more than I am. But you have the capability of putting in deep websites and then pulling down information regarding that. Memex indexes the entire Internet. So they have the ability and the funding and the resources to do so, and so they do. But they also don't scrape everything. Right? So let's say we came across a website and 1 person is running it. That is not worthy of Memex's time to set up the indexing and go through all the instrumentation. Memex is a huge project, and it spans many, many research universities and many, many funding sources and many, many subcontractors.
And so being sort of like, hey, Memex. We need this thing. We need it tomorrow. You can't do that. Right? So that's why you have someone like me in house who can sort of be more flexible when they can't move forward. And also, I do a lot of design work for the Memex project. So they come to me for, like, suggestions on how to make the tool better, how to make it integrate more easily with the analysts. But the initial tool was developed by Chris White, who is amazing. And so that was how it started. And now a lot of folks are working on this. It's become like a real multimillion dollar project. And so they can't be as flexible as they once were.
And that's that's why I'm useful.
[00:41:31] Unknown:
So what are some of the technical and legal challenges that you've experienced in the course of your work?
[00:41:36] Unknown:
Technical challenges usually involve really advanced machine learning things. I'm still not the best at image stuff. I'm actually working right now on a facial recognition platform and the ability to pull numbers out of images because oftentimes on these websites, there'll be a phone number across an image. But I think that using some tutorials on the MNIST dataset and just OpenCV and a few other things, that challenge will be resolved within the next week. I actually was playing around with an initial solution that got pretty close to 94% accuracy before this call. So, hopefully, that will be done within the next week. But most of my technical challenges are around sort of image processing things. I'm an NLP guy, so it's new ground for me. But it's fun. It's fun.
I would say, yeah, that that probably be my biggest technical challenges. Legal challenges are a little less present just because I work for lawyers, and so they kind of know what I'm allowed to do and what not allowed to do. And so they're sort of like, okay. You can work on this. You can't work on that. You can work on this. You can't work on that. And so there's a clear there's a clear sort of, like, division about where where my time should be spent, and I just don't work on things that will be, sort of not legally allowed or anything like that. And I have stumbled into this a couple of times. It's really tricky when you work for law enforcement what you can and cannot do, but I have very capable gifted bosses.
[00:43:02] Unknown:
So does your search engine try to infer who might be engaged in sex work voluntarily as opposed to those being forced into it against their will?
[00:43:09] Unknown:
So no. And that's because we get all of our case work from, like, an initial investigation. So you need to have either landed in the hospital or been the subject of domestic violence or been, like, referred to in some way in the initial investigation that indicates that you are not getting your full rights and, like, physical or sexual abuse is happening to you. And so we benefit from the fact of having, you know, human observation on our side. And the NYPD is very good about figuring out what is, like, legitimate, not coerced sex work versus what isn't. There's very clear signs.
And so when we start an investigation, we're reasonably sure something is going on. And we've also gotten very good at knowing what trafficking looks like just from, like, a point of fact sort of way. Now that's, of course, local to New York. Right? If you look nationally or internationally, trafficking looks very different. It's different in different cultures. There are different signs to look for. Some of the ones that are the same everywhere are: the girl was abused or the guy was abused physically. So there's some sort of, like, physical issue, and they usually are afraid to say why.
So, typically, if a girl is being physically abused and she's a prostitute, she will say that her boyfriend hit her, not her pimp. Because if she says her pimp hit her, then they know it's trafficking, because that's pretty much the definition. When you're being coerced into having sex against your will for money and then not being able to keep the money, well, that's the whole definition. But if you're forced into having sex against your will, for any reason, that's pretty much either rape or sex trafficking. So that happens, and then we have very clear patterns of indication, typically, that it is sex trafficking. So my tool does not do any semantic analysis around whether prostitution is legal or not legal. Also, it's very hard to do that, I should say.
I don't think you could pick up anything from an ad because no one's going to raise a flag and say, yeah, I'm trafficked. I mean, you know, I'm sure that one could come up with a scheme if they had a full view of the Internet and knew everything that was happening on a real person's cell phone, but that again gets into the privacy debate, which is dangerous territory.
[00:45:28] Unknown:
You know, there there is I I know it may be the minority, but there are cases out there where people choose sex work, I mean, as a as a lucrative source of income. Whether it's right or wrong is another story altogether, but it definitely is a thing. So I'm glad to hear that the way that people come to your attention is makes it very clear that your services are are are needed.
[00:45:53] Unknown:
We wanna make sure that we're only helping people that want to be helped. If people want to be prostitutes for money, I don't know. I guess I say good for them. It's up to you. If people are in trouble, then we wanna help them. That's sort of the end of it.
[00:46:10] Unknown:
So you mentioned that this looks very different regionally and that what you're talking about is a New York specific thing. I can understand, like, in different, you know, different cultures internationally. But, like, as just a for instance, how is sex trafficking different in Boston than in New York? I ask because I'm a Bostonian.
[00:46:29] Unknown:
So sex trafficking in Boston is probably very similar. The classification is not city to city. The classification is city, rural, or suburban. So sex trafficking in Long Island, for instance, or in, like, a suburb of New Jersey would look very different than it does in Manhattan. Boston and New York, it would look very similar. So you could apply the metrics that we're doing and coming up with in other cities. And in fact, we intend on doing that. We intend on collaborating with other cities, taking in the learnings that we have, taking what they have, all becoming better. And there are conferences around this already. So it's not like we haven't already started. It's just we intend to keep doing this. But for instance, in fracking towns, in, I believe, it's either Montana or North Dakota or South Dakota. I can never remember which state exactly it's happening in. It might be all 3, which is the reason I say that. So what's happening there is women are being sort of shipped in by the busload for these sort of, like, boom towns that are coming up, and not all of them are getting there legitimately. Some people are setting up sex trafficking shops in these sort of fracking towns. And so it looks very different in a place where it's a fracking boomtown than in, like, you know, Canal Street in Manhattan.
So that's the difference. I think that you would probably end up with a very similar situation in Boston. I mean, like, you know, there's gonna be cultural norms that are different between New York and Boston, but more or less, it's going to be a young woman who's being physically abused, who says that her boyfriend is beating her when in fact it's her pimp beating her, and, you know, she sort of ends up in the hospital. And so we get a call, and it's like, oh, hey. Let's do something about this. So that would be how that would go. But in a boom town, right, like, completely different set of infrastructure, there really isn't any. You have very different sort of population distributions. It's, like, 90% men and then, like, 10% women, or there may be even 2 women in the town. But you'll see, like, hundreds of ads for sex workers if you go to, like, some of these places, and that's sort of, like, a what's-going-on-here moment. That's somewhat of what the regional stuff is.
[00:48:37] Unknown:
So what was the most surprising fact that you uncovered as part of your research?
[00:48:41] Unknown:
Oh, the child sex trafficking websites, by far and away. The first time I saw one, I was both disgusted and shocked. Actually, I was like a whole mix of emotions, but mostly I just wanted to put my fist through a wall and in fact every wall. Because when you see girls sexualized in that way, there is no other response than I need to hurt someone. There's just no way around it. It's so depressing and sad. So that was the most shocking. And seeing it not on the deep web was probably the worst part about it. Being like, oh, wow. These guys are just on the regular Internet. They are comfortable being this. Okay.
So when do we go into their house and arrest them and stop this? It was sort of like, can we go tomorrow? We know the police. Let's go tomorrow. I'm buying everyone sandwiches, and we're gonna just take them down. Because due process is important, but, like, we're sure right now. So that was the worst thing.
[00:49:58] Unknown:
I can well imagine. I I I can only I I yeah. I mean, that's on the 1 hand, that's what makes what you do so incredible. On the on the other hand, that's part of what makes what you do so unique. I don't think I could do that job because I think I would have a very hard time remaining within the framework of the law if I encountered things like that. It must get tough.
[00:50:20] Unknown:
It is so hard. I mean, I do. I've never done anything bad or illegal, thankfully. I've always been able to sort of, like, recover. But, I mean, that day, I I just literally went outside to the park that's by my office and just started yelling as loud as I could because I didn't know what else to do. And then I went back in and finished writing the automation so that we could get these I'm sorry to curse, but there's no other way to call them. They that they that is their designation.
[00:50:49] Unknown:
I think we're good. I'm guessing that we don't have any minors listening to our podcast, but we can bleep you out if that becomes necessary. But in any case, very cool. So without revealing anything you shouldn't, are you aware of anyone being set free as a result of your work?
[00:51:06] Unknown:
So I can speak to the fact that people have been set free as a result of our work, because it's not my work. It's everyone's work. There's a team. It's very collaborative. There are investigators. There are lawyers. There are analysts. There's me. There's the larger district attorney's office. There's the police department. Absolutely, and there's an anti-human-trafficking unit in the police department too, and they absolutely play a part in every single investigation that we take part in. The FBI helps us. Like, I would never call it my work. I would call it our work, and I'm thankful every day that I have collaborators, because 10 years ago, you know, it was just me thinking about, well, okay. So how do I bring down the biggest, oldest form of human sadness ever? And to, like, having many, many people that I can call colleagues. So, yeah, I would say that we have set people free, and it honestly is what keeps me from going insane.
[00:52:06] Unknown:
So given the complicated socioeconomic aspects of human trafficking and prosecution of those who are responsible, can you discuss some of the moral and ethical considerations that you have been confronted with while building these tools?
[00:52:19] Unknown:
Yes. The most important moral and ethical considerations, and it is an extremely nuanced area, so I appreciate you sort of stating that explicitly. Privacy is probably the biggest concern for me. Okay. I've just automated this process. Right? Another coder might be able to come in, and I try to make my code as hard to change from its specific use as possible. I try not to build my tools in too general a way so that you could then take this and spy on everyone. But someone could take the methodologies that I'm applying here and do sort of a big brother-y thing if they tried hard enough. But all that is documented really well on the Internet, so anyone in the government could probably do this anyway. But it's something that I do worry about: one day, having my tools being used by someone who has less than good intentions.
Yeah. I think though being a realist, no 1 really gets into the government unless they have good intentions. And if they don't have good intentions, they tend to leave rather quickly because, the pay is awful. Like, if you're not a good person, you don't stay for the power because the pay is sort of like you you leave very fast. So I'm only sort of minimally concerned about this.
[00:53:38] Unknown:
Do you think that's true of the government in general or of law enforcement in particular?
[00:53:42] Unknown:
I think it's true of local government at any sort of level. I don't know about the federal level. Right? So, obviously, you have all the things with NSA and the FBI and CIA. Company. Right. Right. Right. So with that level, but I'm not building tools for them. They don't have access to my source. They have access to the open source things that I do, but those tools exist in so many other formats and ways at this point, like ScraperWiki, Google Image Search. Right? I'm not reinventing the wheel. I'm not inventing a better mousetrap for someone who wants to use this towards malicious ends.
I think that investigators could use it to, like, do something shady if they really, really wanted to, but there's so many other things available to them that I feel like my tools are sort of a drop in the bucket when it gets to that level. Now, of course, no one at the district attorney's office until me had the technical ability to sort of implement a lot of these tools, unfortunately. I'm trying to change that so hard. I'm actually working on an open source book to teach my colleagues at the DA's office how to use Python in particular, but how to program generally.
So, and part of me coming on the show was I'm hoping that people will help me write this open source book and make it better. So this way, I can I I can teach folks in a lasting way so the day I leave isn't the because I'm not gonna do this forever? It's too depressing to do forever. I will probably end up either working for a big company or starting my own company that does, advocacy work, but in a less direct way at some point just because the work is so emotionally taxing. But, I'm hoping that, technological innovation will be something that I can bring to the local government sort of permanently so that they they have a way of saving time, getting correct results, and actually doing the due diligence to help the people that that matter.
[00:55:36] Unknown:
That's so great. It's really interesting to hear you say that because, I mean, I'm sure you're aware of the whole civic hacking movement. Right? Like, Code for America and the various civic hacking groups that have formed in municipalities and cities and towns all across the nation. So it's really kind of neat to see that movement coming from both the outside in and from the inside out.
[00:55:59] Unknown:
Absolutely. And I'm part of a collective of folks that are interested in this. So sometimes, I work out of a place called Civic Hall, which is sort of like a meeting ground in New York City for folks that are interested in civic and social innovation. There are a whole host of folks inside and outside of government that are coming together now and having conversations that are important, talking about technology and how it can be used to better the lives of all of us. So I think it's an exciting time. You know, I think that there are a lot of reasons to be hopeful in the broader scheme of things, and I think that we might start to see real social justice and change and the betterment of all, at the very least in this country, in the major cities. I think urbanization is making us more thoughtful, sensitive, and hopefully capable individuals who can live freer, happier lives.
[00:56:47] Unknown:
I think that's so important, and I'm so glad to see this movement, as I say, happening on multiple levels, because talking about technology and how it can be used for both good and for ill, I feel like, and not to necessarily disagree with or countermand your previous point about Tor, but I feel like, from my perspective anyway, the right approach is not to try to restrict or ratchet down on it, but to try to say, how can we do good to counter it? You know what I'm saying? Like, how can we encourage and promulgate the good?
[00:57:21] Unknown:
Sure. And that is my intention with my earlier remarks about Tor. I don't think it should go away. I wanna be very clear on that. I think that we need to just build a better system, and that means sort of being more thoughtful and empathetic. And I think that that's something that we can apply to a lot of our technology. I think it's not limited to Tor at all. I think there are so many things that we can make more empathetic, more positive, and more capable of actually, I mean, the way people are using Twitter right now, for instance, like I mentioned with that earlier example, I don't think that's something that people thought about when they started, and Twitter's been used in many instances for social democracy and good. I think there are so many places where technology like that can be powerful.
And we just have to find more ways of doing this sort of thing, and then we'll be able to do something meaningful that couldn't happen 100 years ago, which is exciting.
[00:58:15] Unknown:
To paraphrase you a bit, I think that as engineers, we're all guilty of this: any time somebody hears the word build, they automatically jump to the idea of technology. But we don't necessarily need to build a better Tor system. We need to build a better social system around the use of Tor to improve the landscape and ecosystem around what is available on those networks, improve the ethical and moral aspects and outlooks of the people who are using it, and empower them to have some capability of policing their networks within that overall.
[00:58:58] Unknown:
Yes. Definitely.
[00:59:01] Unknown:
So are there any projects out there that you consider similar to what you're working on?
[00:59:06] Unknown:
Yes. I would say there's Thorn's Spotlight tool. I would say the Memex project, obviously. Yeah. Those are the major ones that come to mind. So there's other people in the space that are collaborators. Polaris Project runs a national hotline, which is wonderful. It's not a competing tool or something. And not that there's any real competition. It's like another tool out there, but it's certainly used for anti-trafficking endeavors. Also, DoSomething.org has... What was that tool called? I'm sorry. Sorry. Polaris Project has a national hotline for folks that might be trafficked. You can call in or text in, and they will record your call and get you to local authorities.
So the Polaris Project is a nonprofit, I think, based out of DC, and they do a lot of wonderful work. Data scientists will work for them sort of on a volunteer basis. Also, DataKind, who are friends of mine, do some great anti-trafficking work among other things. But I don't know if their endeavors are publicly available. DataKind tends to shy away from publicity because they're wonderful people and don't like the spotlight at all. Let me think if there's anyone else. DoSomething, you started to say before I read the email. Right. Right. Right. So there's also DoSomething.org. So they are more broadly focused. They're a help center for teens. You can text in. Sometimes people will text them things like, I want to kill myself, and they will sort of respond to this. So it's like a general purpose crisis center, but I know they've had a few instances of people saying, help me, I'm a slave, or help me, I'm being raped by my dad, or help me, I'm being trafficked, even.
And so they'll do stuff to help those people in those situations get out of it. So these are some examples of platforms. Typically, it's either scraping or, you know, a hotline type of thing or some image recognition software. So that's just a short list. I'm sure there are more tools that I'm not aware of. I haven't talked to everyone in the space yet. Oh, there's also Rescue Forensics. They are data scientists that are working on an early stage startup that wants to do this as well. I got to see a little bit of their tool a while back and it looked interesting. It shows a lot of promise.
[01:01:18] Unknown:
So what would it take for other municipalities and law enforcement agencies to get started with using your tools?
[01:01:25] Unknown:
Pretty much just go to my GitHub account. It's GitHub slash capital E, Eric, capital S, S-c-h-l-e-s. The two tools that are available right now are Alert System and Investigator. It's sort of a play on words because it's investa and then underscore gator, and the tool has a picture of an alligator in a vest. And those two tools are open source, and you can start using them now. You can also contact me. My email address is ericschles@gmail.com. It's just my name at gmail.com. And reach out, and I'd be happy to collaborate with you, connect you with my boss, and then we would start building stuff. I do work with some other cities, and I do volunteer with a number of nonprofits who are interested in anti-trafficking things. So it's certainly something I like to do. I help too many people, hopefully. Hopefully, I help too many people.
[01:02:24] Unknown:
That's a good feeling to have, I should say, especially in your line of work. So how can our listeners get involved and help with this project?
[01:02:32] Unknown:
So if you're interested in helping, just tweet at me, @EricSchles, capital E, capital S, or, you know, email me or volunteer for one of the many nonprofits that are interested in anti-trafficking endeavors. Or you can volunteer at something having to do with children and be a mentor for someone, because underprivileged children are the most at risk to be trafficked. So if you make a difference in someone's life, they are much, much less likely to be trafficked in the future, especially at an early age.
[01:03:05] Unknown:
That's great. So is there anything that you feel like we should have asked you that we didn't, or is there any message that you would care to give to people in the technical community, something that you wanna get out there? What would it be?
[01:03:22] Unknown:
There is a world of good waiting to happen, and all we need to do is believe that we are capable of changing things. Literally, technology, writing code, it just means being able to be content creators. That's how we change opinions, points of view: we get people to listen. Our jobs are literally to get eyeballs onto screens. We just have to sort of direct them to the right ideas. And we have the platform now like we've never had before. It used to be, you know, just TV folks. Now anyone can put up a website. Now anyone who has creativity and passion and design skills or drive can be heard, and we have a responsibility, as the people driving who listens and who hears, to make it a more equitable, inclusive, small, warm, and safe place to be alive.
So, yeah, that would be all I'd say.
[01:04:36] Unknown:
That's pretty good. Yeah. Thank you very much. Well, I think now it's time to go to the picks, so I'll get us started off. My first pick today is a Twitter account called Accidental Art, which is an account where people will post data visualizations that they have created accidentally while trying to create other data visualizations. So it's just a lot of really interesting and occasionally beautiful visualizations of data that went wrong, and just some of the interesting things that come out of that. And my next pick is a website called tldrlegal.com, which is a site for being able to quickly gain information about and compare different open source licenses, so that if you're creating a new project or you're evaluating a project and you want to know what some of the details of the license are without necessarily having to read the entire text of it, you can go there and it will just give you the CliffsNotes version. And you can also pick different licenses and it'll show you a compare and contrast between them. And my last pick is a band called Rishloo, which I started listening to recently.
It's kind of hard to classify their genre. But the best way I can describe it is that their sound is somewhat similar to Tool. And if you like Tool's music, you'll probably like Rishloo's music as well. Chris, go ahead.
[01:06:04] Unknown:
Very cool. This week, I'm not gonna pick a beer, which is shocking to anybody who's been listening to the podcast for a while because I usually pick beer. But this week, I'm gonna pick some different things. Oh, I lied. I have a beer at the end. Sorry. My first pick is Neil Gaiman's Sandman: Overture. I've been a huge fan of Neil Gaiman and his work. I love his comics. I love his literature, and the Sandman series is just amazing. When I was in college, oh, at least many years ago because I'm old, our visual media design teacher said, go find a comic book series. Don't just walk into the store and get one because I'm telling you to. Take the time, browse, find something you like. And I realized that Minintight didn't really do anything for me, and I encountered Sandman, and it's this incredibly rich mythic story that in fact is so rich and detailed that a bunch of experts from around the world started annotating his stories, you know, with the actual mythic references that he was including in his work.
And it was just incredible. I have the entire original series in hardcover. And last year, after, like, a 15-year hiatus, he started back again with a new set of stories in the Sandman universe, and they are fantastic. So often, at least, an artist takes a long break from something and comes back and it's like, yeah, maybe you should have left it alone. But these are phenomenal. The artwork is so gorgeous that I have to resist, you know, showing my wife, look, isn't that amazing? Because she's sick of me by this point. But I really feel like every other page that I turn, I think, you know, I'd love to see this as a poster on my wall. It's just that beautiful. So my next pick is a restaurant called Hen of the Wood in Waterbury, Vermont. It is a unique place.
It's kind of a foodie destination, so you probably want to make a reservation well in advance. One of the cool things that makes it worth mentioning, aside from the amazing food, is that you can get a seat that is literally overlooking a waterfall, so you're getting spray from the waterfall while you have your meal. It's a pretty cool place. For my last pick, I am gonna pick a beer, despite what I said initially. It is The Alchemist brewing company's Heady Topper. They're a brewing company from Vermont, also from Waterbury.
They used to have a brew pub that unfortunately got taken out by a hurricane, so now they just brew beer. And Heady Topper is delicious. It's also in the category of hoppy beers that are really great and have a really great smooth finish, and people drive for three states or more to come get this beer. So it's definitely worth a try. And that's it for me. Eric, what picks do you have for us?
[01:08:54] Unknown:
So I have quite a few. One would be my friend James Powell's blog, seriously.dontusethiscode.com. He is a dear friend and writes crazy, awesome, intense things, and I love all the crazy Python internals he talks about. That would be my first. My second pick would probably be Julia Nunes. She's a relatively unknown recording artist, and she uses the ukulele and her exquisite vocal styles to make wonderful, like, thoughtful music. And I got to see her in concert a while back. She's great live. And then I would say third and finally, I've been looking for this website. I'm not sure I'm spelling it right. Okay. Whatever. I'll just say the third is an old favorite, xkcd.com.
If you guys don't know xkcd, then you're missing out on one of the greatest web series comic things of all time. So, old standby. I think I read it every day. It's phenomenal and the only thing that makes me laugh consistently.
[01:10:01] Unknown:
Yeah. The xkcd is in fact fantastic, and it's really funny. I just found myself reading... I'm reading Mastering Emacs because I switched to Emacs recently, or back to Emacs, I should say. In the book, they have a reference. They included the M-x butterfly. Xkcd, great one. I mean, it's so clever. So great stuff.
[01:10:21] Unknown:
And for those of you who occasionally read xkcd and don't have a clue what he's talking about, there's a companion website called explainxkcd.com, which is a wiki of people explaining the various references that he includes in his comics.
[01:10:42] Unknown:
Well, that's awesome. I didn't know about that.
[01:10:46] Unknown:
Eric, we really appreciate you taking the time to come on and talk to us about the incredible work that you're doing. And for anybody who would like to follow you and keep track of what you're up to, what's the best way for them to do that?
[01:11:01] Unknown:
So Twitter is pretty good. Email is fine if you want to contact me about things. Yeah, I would say actually, Twitter is probably the easiest way to contact me. I check my Twitter feed, unfortunately, like, bidaily, because I like reading the news when I have nothing to do.
[01:11:23] Unknown:
You know, one question. I'm sorry, Tobias. I don't mean to break the script here. But one thing I've been wondering about, and I haven't been able to kind of figure out from web searching: I read a lot about Memex in the news and whatnot. Is that something that's eventually gonna be open sourced? Is it open sourced? Do you guys have it? Open sourced. Yeah. It is open sourced, but we don't talk about it. We don't talk about it.
[01:11:43] Unknown:
So it's kinda like security by obscurity. If you know where to find Memex, you know where the code is. If you don't know where to find Memex, you don't know where the code is. And people don't open source things until the last possible moment. This is sort of how DARPA works with all of its projects. It makes sure everything's open source, but it doesn't tell you how to find anything. So since I'm a pilot member, I know where all the source code is, but I don't think I should or will share that information. But, yeah, if you're diligent enough and you happen to know who all the Memex pilot partners are, then you can find their source code.
[01:12:22] Unknown:
Fair enough. Well, thank you very much. It's been an incredible conversation
[01:12:29] Unknown:
and very powerful. So thank you for taking the time. Thank you both for taking the time. This has been a wonderful opportunity. I always love talking about my work with thoughtful, interested, and genuinely wonderful humans like yourselves.
Introduction and Host Details
Guest Introduction: Eric Schles
Eric's Journey into Python
Python's Ecosystem and Utility
Comparing Python and R
Eric's Inspiration to Combat Human Trafficking
Understanding the Deep Web
Challenges and Ethics of the Deep Web
Balancing Privacy and Policing
Choosing Python for Anti-Trafficking Tools
Technical Details of the Search Engine
Technical and Legal Challenges
Identifying Trafficking Cases
Regional Differences in Trafficking
Surprising Discoveries in Research
Impact of the Work
Moral and Ethical Considerations
Similar Projects and Collaborations
Getting Started with the Tools
How Listeners Can Help
Final Thoughts and Messages
Picks and Recommendations