Thursday, January 28, 2016
Monday, January 18, 2016
.NET Compiler Platform "Roslyn"
At PicScout, we use the .NET Compiler Platform, better known by its codename "Roslyn", an open source compiler and code analysis API for C# and Visual Basic.
The compilers are available via the traditional
command-line programs and also as APIs, which are available natively from
within .NET code.
Roslyn exposes modules for syntactic (lexical) analysis of code, semantic analysis, dynamic compilation
to CIL, and code emission.
Recently, I used Roslyn for updating hundreds of files to
support an addition to our logging framework.
I needed to add a new member to each class that was using the logger and modify all the calls to the logger, of which there were several hundred.
We came up with two ways of handling this challenge.
After reading a class into an object representing it, one possible way of adding a member is to supply a syntactic definition of the new member, and then regenerate the source code for the class.
The problem with this approach was the relative difficulty
of configuring a new member correctly.
Here is how it might look:
Generating the code:
private readonly OtherClass _b = new OtherClass(1, "abc");
Another option, which is more direct, was to simply get the properties of
the class and use them.
For example, we know where the class definition ends and we
can append a new line containing the member definition.
Here is how it looks:
Get class details:
Insert the new line (new member):
After that, replacing the existing logger calls with calls to the new logger is a simple matter of search and replace.
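The second approach can be illustrated without Roslyn at all. The following is a minimal sketch in Python, not the code we ran: it assumes the member goes on a new line right after the opening brace of the class body, and the names `add_member`, `replace_logger_calls` and `OldLogger` are hypothetical. Roslyn gives the class position reliably; the regular expression here is only a stand-in for that lookup.

```python
import re


def add_member(source: str, class_name: str, member: str) -> str:
    """Insert `member` as the first line inside the body of `class_name`."""
    # Locate the class declaration (Roslyn would provide this position directly).
    decl = re.search(r"\bclass\s+" + re.escape(class_name) + r"\b", source)
    if decl is None:
        raise ValueError(f"class {class_name} not found")
    # The class body starts at the first opening brace after the declaration.
    brace = source.index("{", decl.end())
    # Reuse the indentation of the declaration line, one level deeper.
    line_start = source.rfind("\n", 0, decl.start()) + 1
    indent = re.match(r"[ \t]*", source[line_start:]).group(0)
    return source[:brace + 1] + "\n" + indent + "    " + member + source[brace + 1:]


def replace_logger_calls(source: str, old_prefix: str, new_prefix: str) -> str:
    """Plain search and replace of the call prefix, e.g. 'OldLogger.' -> '_b.'."""
    return source.replace(old_prefix, new_prefix)
```

A call such as `add_member(text, "Foo", 'private readonly OtherClass _b = new OtherClass(1, "abc");')` followed by `replace_logger_calls(text, "OldLogger.", "_b.")` covers both steps for one file.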
Wednesday, January 6, 2016
PhantomJS
The Problem:
Given a website and an image on that website, the task is to take a screen capture of that image as it appears on the page.
Solution 1:
Manually: enter the website, find the given image (scroll if needed), take the screenshot and save it to the disk.
But what can you do when you have thousands of screenshots to take per hour?
You could employ hundreds of people to handle this scale, but that is expensive, and what do you do when your scale increases or decreases?
Solution 2:
Automate it: if only we could write a piece of software that could do exactly what we need...
So what do we actually need? Something that can:
1) Imitate a browser
2) Find an image on a webpage
3) Take the screenshot
Let me introduce PhantomJS:
PhantomJS is commonly described as a headless WebKit with a JavaScript API.
Headless refers to the fact that the program can be run from the command line without a window system.
JavaScript API means that we can easily write scripts that interact with PhantomJS, which is useful if one needs to find an image on a webpage, for instance.
WebKit is the open-source browser engine that powers Safari; Chrome also used it until it switched to Blink, a WebKit fork, in 2013.
How to use it?
There are many ways to use PhantomJS. Here at PicScout we use Selenium WebDriver to run it; Selenium can control PhantomJS in the same way that it controls any other browser.
How does it help me to take a screen capture?
As we said before, PhantomJS can run JavaScript, so all that is left to do is to write a short script that searches for the image location on the page and let PhantomJS run it.
After receiving the location we can use PhantomJS to take a screen capture. Although PhantomJS is a headless browser, it can still render a web page just like any other browser driven through WebDriver.
Code sample:
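A minimal sketch of the idea, written in Python for brevity and not taken from our production code. It assumes the image is located by its `src` attribute and that the screenshot covers the whole page; `FIND_IMAGE_JS` and `crop_box` are hypothetical names. The script string is the kind of JavaScript a WebDriver session can execute in the page, and the function turns the returned rectangle into a crop box for the screenshot.

```python
# JavaScript to run inside the page: returns the image rectangle in page
# coordinates, or null when no <img> with the given src exists.
FIND_IMAGE_JS = """
var img = document.querySelector('img[src="' + arguments[0] + '"]');
if (!img) { return null; }
var r = img.getBoundingClientRect();
return {x: r.left + window.pageXOffset, y: r.top + window.pageYOffset,
        width: r.width, height: r.height};
"""


def crop_box(rect, page_width, page_height):
    """Convert an element rectangle into a (left, top, right, bottom) box,
    clamped to the bounds of the full-page screenshot."""
    left = max(0, int(rect["x"]))
    top = max(0, int(rect["y"]))
    right = min(page_width, int(rect["x"] + rect["width"]))
    bottom = min(page_height, int(rect["y"] + rect["height"]))
    if right <= left or bottom <= top:
        raise ValueError("image lies outside the captured page")
    return left, top, right, bottom
```

The resulting box can be handed to any image library to cut the image region out of the saved screenshot.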
Running several instances of PhantomJS – problems and solutions
Problem #1: zombie processes. In our app we create and kill PhantomJS processes, and we noticed that after some time there are many zombie instances of PhantomJS.
Solution #1: for unknown reasons, creating a new PhantomJS instance occasionally fails. An exception is thrown, yet a new PhantomJS process has still been started. In that case we need to find the process ID ourselves and kill the process.
Problem #2: low success rate under high CPU usage. When the CPU reached 100% we received a lot of errors from PhantomJS.
Solution #2: the number of PhantomJS instances should be set according to the available computing power. Most of the time PhantomJS won't consume much CPU, but there are websites for which this isn't the case. Take this into consideration when you decide how many PhantomJS processes to run.
Problem #3: sometimes the screen capture fails without any apparent reason.
Solution #3: we were able to increase the screen capture success rate by using a retry mechanism.
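A retry mechanism of this kind can be sketched in a few lines. This is an illustrative Python version, not our implementation; the name `with_retries`, the number of attempts and the delay are all assumptions.

```python
import time


def with_retries(action, attempts=3, delay_seconds=1.0, sleep=time.sleep):
    """Call `action` until it succeeds or `attempts` calls have failed."""
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    last_error = None
    for attempt in range(1, attempts + 1):
        try:
            return action()
        except Exception as error:  # any failure of the capture counts
            last_error = error
            if attempt < attempts:
                sleep(delay_seconds)  # short pause before the next try
    # Every attempt failed: surface the last error to the caller.
    raise last_error
```

Wrapping the capture call, for example `with_retries(lambda: capture(url))` where `capture` is whatever function takes the screenshot, keeps the retry policy in one place.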
Thursday, December 3, 2015
Why should you use reCAPTCHA on public websites?
The problem:
We've developed an API which allows users to search and upload images.
Any application that wants to query it uses an API key which allows it to perform different actions according to its permissions.
Recently we started to expose some of the API's abilities on public websites.
For example, see the PicScout Search Tool on www.picscout.com (and press the "Launch Tool" button).
Here’s the issue: Exposing the key to unknown users can make us vulnerable to spam and abuse.
The solution:
In order to overcome this problem we decided to use Google reCAPTCHA.
Using this tool means that only real people, as opposed to malicious bots, can pass through the system.
Reaching this solution required both client-side and server-side adaptations. On the client side, we added support for the reCAPTCHA widget. This widget is shown to users before their first action on the site, and afterwards only if their token has expired. On the server side, we added a second layer of authentication. This authentication is enforced only on API keys that are public, meaning those used on public sites. When making a request, users must send an API key as well as a token supplied to them by Google reCAPTCHA. The server verifies this token with Google, using a secret shared between the server and Google. If this information is successfully verified, the resource is returned to the user. Otherwise, the request fails.
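The server-side check can be sketched as follows. This is a minimal Python illustration, not our server code: the function names and the `public_keys` set are assumptions, and validating the API key itself is assumed to happen elsewhere. The endpoint and the `secret`, `response` and `success` fields are those of Google's reCAPTCHA verification API.

```python
import json
import urllib.parse
import urllib.request

VERIFY_URL = "https://www.google.com/recaptcha/api/siteverify"


def post_form(url, fields):
    """POST form-encoded fields and return the decoded JSON reply."""
    data = urllib.parse.urlencode(fields).encode("ascii")
    with urllib.request.urlopen(url, data=data, timeout=10) as response:
        return json.loads(response.read().decode("utf-8"))


def is_request_allowed(api_key, token, public_keys, secret, post=post_form):
    """Apply the second authentication layer to public API keys only."""
    if api_key not in public_keys:
        return True  # non-public keys skip the reCAPTCHA layer
    if not token:
        return False  # public key without a token: reject without calling Google
    result = post(VERIFY_URL, {"secret": secret, "response": token})
    return bool(result.get("success"))
```

Passing `post` as a parameter keeps the network call replaceable, so the decision logic can be exercised without contacting Google.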
That's about it on how we use reCAPTCHA at PicScout.
Monday, November 16, 2015
Riddle me this
We recently published some code riddles.
It was really fun writing them, and we had a lot of good responses from software developers who enjoyed solving them.
The solvers of the riddles got a nice T-shirt:
For those of you who enjoy riddles here they are:
- Follow the bread crumb trail from here:
- Go to http://riddle.guru
Enjoy :-)
Thursday, June 4, 2015
Managing dependent jobs in Jenkins
The problem
It is well known that Jenkins can handle job dependencies based on Maven dependencies.
But how can we manage dependencies between Jenkins jobs that are based on .NET code?
Of course, we can manage job dependencies manually, marking in each job which jobs depend on it, but this takes a lot of time to maintain and is also error prone.
We looked into some options (there weren't many) and couldn't find anything that quite fitted our needs.
This is because here at PicScout, each of our Jenkins jobs is mapped to a single solution.
It can run MSBuild or unit tests on each relevant project in the solution.
If, for example, we take NDepend's powerful API and try to adjust it to our needs, we can use it to find out which assemblies are referenced by our solution's projects.
But which solutions hold those assemblies, and in what order should we build them? That is left for us to implement.
Surely there must be a better alternative.
The solution
We decided to write something of our own.
We wanted a tool that would integrate well with Jenkins.
While Jenkins can run any type of script or executable, if you run your code within Jenkins itself (using a Groovy system script) you can use its internal API to do things like start new jobs, wait for them to complete and analyze their build results quite effortlessly.
This, however, means you have to write your code in a JVM-based language.
That is why we opted to write our tool in Java.
The dependency tool we wrote can map solution and project files in given folders and build a dependency chain for each project and solution.
Given a project, it can tell you which projects will be affected by a change in that project, in build order, and the same is true for solutions.
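The core of such a tool is a walk over the reverse reference graph followed by a topological sort. The sketch below shows that idea in Python; it is not the Java tool itself, and the function name and the shape of the `references` mapping are assumptions.

```python
from collections import defaultdict, deque


def affected_in_build_order(references, changed):
    """`references` maps each project to the projects it references.
    Returns `changed` plus everything that depends on it, in build order."""
    # Invert the graph: for each project, who references it?
    dependents = defaultdict(set)
    for project, refs in references.items():
        for ref in set(refs):
            dependents[ref].add(project)

    # Collect the changed project and all of its transitive dependents.
    affected = {changed}
    queue = deque([changed])
    while queue:
        current = queue.popleft()
        for dependent in dependents[current]:
            if dependent not in affected:
                affected.add(dependent)
                queue.append(dependent)

    # Kahn's algorithm restricted to the affected projects.
    pending = {p: len(set(references.get(p, ())) & affected) for p in affected}
    ready = sorted(p for p, count in pending.items() if count == 0)
    order = []
    while ready:
        current = ready.pop(0)
        order.append(current)
        for dependent in sorted(dependents[current]):
            pending[dependent] -= 1
            if pending[dependent] == 0:
                ready.append(dependent)
    if len(order) != len(affected):
        raise ValueError("circular reference between projects")
    return order
```

For instance, if `Api` references `Data` and `Core`, and `Data` references `Core`, a change in `Core` yields the order `Core`, `Data`, `Api`, so each job runs only after everything it references has been rebuilt.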
Now, all that is left to do is to integrate it with Jenkins and Git.
Git/Jenkins Integration
The process we built is made of a few steps:
1. We use Git hooks to identify changed folders, and we then find which solutions are in those folders.
2. We send the list of changed solutions to a special job we wrote in Jenkins.
3. The job runs a small Groovy system script. This script finds the dependent jobs using the dependency tool, runs them, and decides whether the Git commit can be accepted into our master branch (only if all the jobs passed).
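Step 1 of the process above can be sketched as a simple path lookup. This Python version is illustrative only and is not the hook we use: it assumes a solution counts as changed when a changed file lies anywhere under the folder holding its `.sln` file, and that paths use forward slashes relative to the repository root, as Git reports them. The function name is hypothetical.

```python
from pathlib import PurePosixPath


def changed_solutions(changed_files, solution_files):
    """Return the solutions whose folder contains at least one changed file."""
    changed = [PurePosixPath(path) for path in changed_files]
    hits = []
    for solution in solution_files:
        folder = PurePosixPath(solution).parent
        # A file belongs to the solution if the solution folder is one of
        # the file's ancestor directories.
        if any(folder in path.parents for path in changed):
            hits.append(solution)
    return sorted(hits)
```

The resulting list is what would be sent to the Jenkins job in step 2.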
Finally, the dependency tool can also be integrated with the Jenkins dependency graph view plugin to draw the dependencies:
(we are still working on this one)
Source/Deployment
The dependency tool can be found here along with some example usage projects and groovy scripts.
The tool is also deployed to a Maven repo.
Please note that it is licensed under the MIT license.
Sunday, November 23, 2014
PicScout is Hiring !!!
Hi all,
PicScout is looking for top-notch software engineers who would like to join an extremely innovative team.
If you think you would fit the team, please solve the quiz below, and if your solution is well written, expect a phone call from us. You can write the code in any programming language you like.
Don't forget to attach your CV along with the solution.
Candidates who are eventually hired will win a 3K NIS prize.
So here it is:
We want to build a tic-tac-toe server.
The server will enable two users to log in. It will create a match between them and will let them play until the match is over.
The clients can run on any UI framework or even as console applications. (We want to put our main focus on the server for this quiz.) Extra credit is given for good design, test coverage and clean code.