Calculate cosine similarities between queries and documents

In our previous post, we discussed tf-idf vectors and how to calculate them. So far, we have learned what term frequency and inverse document frequency are, how they impact the relevancy of a document, and how to calculate them. Now let’s learn how to calculate cosine similarities between queries and documents, and between documents and documents.

If you go through the previous post, you will see that we calculated normalized tf-idf vectors. There’s a reason for that. We’ll come to it in a while.
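As a rough illustration of the calculation, here is a minimal Python sketch. The function names (`normalize`, `dot`, `cosine_similarity`) and the toy vectors are made up for this example and are not taken from the original post. It also shows one well-known property of normalized vectors: once both vectors have unit length, their cosine similarity is simply their dot product. Whether that is the reason the post has in mind is not visible in this excerpt.

```python
import math


def dot(a, b):
    """Dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def normalize(vector):
    """Scale a vector to unit (Euclidean) length; an all-zero vector is returned unchanged."""
    norm = math.sqrt(dot(vector, vector))
    return [x / norm for x in vector] if norm else list(vector)


def cosine_similarity(a, b):
    """Cosine of the angle between a and b; defined here as 0.0 if either vector is all zeros."""
    norm_a = math.sqrt(dot(a, a))
    norm_b = math.sqrt(dot(b, b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot(a, b) / (norm_a * norm_b)


# Hypothetical weight vectors over a shared three-term vocabulary.
query = [1.0, 0.0, 1.0]
doc_a = [2.0, 0.0, 2.0]   # same direction as the query
doc_b = [0.0, 3.0, 0.0]   # shares no terms with the query
doc_c = [1.0, 1.0, 0.0]   # shares one term with the query

sim_a = cosine_similarity(query, doc_a)
sim_b = cosine_similarity(query, doc_b)
sim_c = cosine_similarity(query, doc_c)

# For unit-length vectors, the dot product alone gives the same value.
sim_c_normalized = dot(normalize(query), normalize(doc_c))
```

Ranking the documents by these scores would put doc_a first and doc_b last for this query.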

(more…)


How to calculate tf-idf vectors

We briefly discussed vector space models and TF-IDF in our previous post. In short, TF (Term Frequency) is the number of times a term appears in a given document. IDF (Inverse Document Frequency) is based on the number of documents, out of all the documents in the corpus (collection), in which the term appears at least once: the fewer documents a term appears in, the higher its IDF and the more weight that term carries.
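To make these definitions concrete, here is a minimal Python sketch. It assumes one common variant of the weighting, raw term counts for TF and ln(N / df) for IDF, which may differ from the exact formula used in the full post. The function name `tf_idf_vectors` and the toy corpus are invented for illustration.

```python
import math
from collections import Counter


def tf_idf_vectors(tokenized_docs):
    """Return (vocabulary, idf, vectors) for a list of tokenized documents.

    Assumed weighting: tf = raw count of the term in the document,
    idf = ln(N / df), where df is the number of documents containing the term.
    """
    n_docs = len(tokenized_docs)
    vocabulary = sorted({term for doc in tokenized_docs for term in doc})
    # Document frequency: count each term at most once per document.
    df = Counter(term for doc in tokenized_docs for term in set(doc))
    idf = {term: math.log(n_docs / df[term]) for term in vocabulary}
    vectors = []
    for doc in tokenized_docs:
        tf = Counter(doc)
        vectors.append([tf[term] * idf[term] for term in vocabulary])
    return vocabulary, idf, vectors


# Hypothetical toy corpus, already split into lowercase terms.
corpus = [
    ["new", "york", "times"],
    ["new", "york", "post"],
    ["los", "angeles", "times"],
]

vocabulary, idf, vectors = tf_idf_vectors(corpus)
```

In this toy corpus, "post" appears in only one document while "new" appears in two, so "post" ends up with the higher IDF.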

Let’s work on an example to learn how to calculate tf-idf vectors.
(more…)


Vector Space Model (TF-IDF Weighting)


Brief Introduction

The vector space model, or term vector model, is an algebraic model for representing text documents (and any objects, in general) as vectors of identifiers. The representation of a set of documents as vectors in a common vector space is known as the vector space model. It is fundamental to a host of information retrieval operations, ranging from scoring documents against a query to document classification and document clustering. It is used in information filtering, information retrieval, indexing and relevancy rankings.

Documents and queries are represented as vectors. Each dimension of a vector corresponds to a separate term. If a term appears in the document, the corresponding value in the vector is non-zero; if it doesn’t appear, the value is zero. Among the many different ways to calculate those values, tf-idf weighting is one of the most popular.
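The representation described above can be sketched in a few lines of Python. This is only an illustration under simple assumptions: whitespace tokenization, lowercase terms, and raw term counts as the vector values instead of tf-idf weights. The helper names `build_vocabulary` and `to_count_vector` and the sample texts are hypothetical.

```python
from collections import Counter


def build_vocabulary(texts):
    """Return a sorted list of the distinct terms across all texts."""
    return sorted({term for text in texts for term in text.lower().split()})


def to_count_vector(text, vocabulary):
    """Map a text to a vector of raw term counts, one dimension per vocabulary term."""
    counts = Counter(text.lower().split())
    return [counts[term] for term in vocabulary]


# Hypothetical documents and query.
documents = ["the cat sat on the mat", "the dog chased the cat"]
query = "cat mat"

vocabulary = build_vocabulary(documents)
doc_vectors = [to_count_vector(d, vocabulary) for d in documents]
query_vector = to_count_vector(query, vocabulary)
```

Every document and the query end up as vectors of the same length, with a zero in each dimension whose term they do not contain.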

(more…)


How to download and delete file in Cloudant Nosql DB using python-flask?

If you have already read our article on how to upload a file in Cloudant NoSQL DB, then consider this the next part of that article, where we will talk about how to download and/or delete a file in Cloudant using Python. If you have not gone through our previous article, click here.

Let’s jump directly into establishing a route for downloading a file or attachment from the NoSQL database in Cloudant.

Downloading

@app.route('/download', methods=['POST'])
def download():

(more…)


How to upload a file in Cloudant Nosql DB using python-flask?

In this article I am going to explain how to upload a file in Cloudant NoSQL DB using python-flask. Furthermore, I am also going to explain how to delete, download and list the files in the NoSQL database. If you already know how to store a file in Bluemix storage using Python, then this might be a bit easier. If not, you can read the article here.

First of all, install the cloudant package for Python:

pip install cloudant

Now you are ready to get into action.

(more…)


How to store file in bluemix storage using python



“Cloud” is a big name nowadays, and rightfully so. It provides scalability, economy (pay-per-use and reduced capital cost), flexibility, globalization, reduced cost for technical infrastructure, and numerous other benefits.

One of the most common uses of “clouds” is shared or backup storage. A number of services provide free (limited) storage, and several provide an easy-to-use, comfortable interface, such as a folder (subdirectory) on your desktop where you can drop files to be automatically backed up to the cloud service, retrieved later, or even shared between users. Several of these services, such as Dropbox, SugarSync, SkyDrive, Google Drive, and iCloud, offer free storage. These services provide not only storage but also an Application Programming Interface (API) through which you can program your own way of using them.

Among these services, one prominent name is IBM. IBM Bluemix is a cloud platform as a service (PaaS). It supports several programming languages and frameworks, such as Java, Node.js, Go, PHP, Python, Ruby Sinatra, and Ruby on Rails. In this article I am going to describe how Python can be used to store and delete files in Bluemix object storage.

(more…)


Two-finger scrolling : This is how you enable it in mint


You buy a laptop, install Linux Mint or any Cinnamon desktop, and you realise that two-finger scrolling is not working. It can be quite frustrating, especially when you have been using that feature for some time. But did you know that the feature is built into Linux Mint, just not activated by default?

It is very easy to make two-finger scrolling active.

Step 1:

Go to “Menu” on the bottom left of the window. Click on “Administration” or a gear-like icon.

(more…)


Youtube subscribers : This is how you can view who subscribed to your channel

You want to know who follows you, likes you and subscribes to you, right? How did I know? Aw, I do the same too. Everyone does. This applies both in the real world and in the virtual world. But in this article, I am going to write specifically about YouTube.

YouTube is one of the best platforms, if not the best, to showcase your creativity. You can get hundreds of millions of potential viewers for your videos. Like any other creator in the world, you also need an audience. Otherwise, where’s the fun, right?

But sometimes we want to know who has subscribed to our channel and how many subscribers we have, especially during the beginning phase of our YouTube content creator journey.

Many of us don’t realise it, but it is quite easy. You can see who subscribed to your channel, and when, by following these simple steps:

Step 1: Log in to your YouTube channel



(more…)


How to instantly know when your favourite person tweets

twitter_cover

Image Source : Andrew Mager – Flickr – link

Ever got fed up with Twitter storming your wall with tweets you do not care about?

We all feel that way. Even though we follow numerous accounts, we are biased towards some more than others. We want to see their tweets as soon as they post them. We don’t want to miss any of them.

Did you know Twitter makes that facility available to all its users? I will show you a very simple way to never miss out on the tweets of your favourite person.

(more…)


How to send email using PHP via “sendmail” from localhost (XAMPP)

XAMPP includes sendmail in its package. You can download the whole XAMPP package from this link. Once you have installed XAMPP on Windows, keep in mind that sendmail cannot send emails by itself. It needs a mail server such as Gmail. Follow these steps:

Step 1:

Go to the XAMPP folder (wherever you have installed it). In my case, it is “C:/Xampp”. Inside that folder there should be a folder called “sendmail”, provided you selected it while installing XAMPP. Go inside that folder and open the file “sendmail.ini” in a text editor.

(more…)
