The post Basic Counting in Python appeared first on Sparrow Computing.

Say you have a Python list like this:

```
things = [
    "a",
    "a", "b",
    "a", "b", "c",
    "a", "b", "c", "d",
]
```

With a list like this, you might care about a few different counts. What’s the count of all items? What’s the count of unique items? How many instances are there of `<some value>`? How many instances are there of all unique values?

We can answer these questions easily and efficiently with lists, sets and dictionaries. Being very comfortable with these objects is important for writing good Python code. With that said, let’s find all our counts.

We’ll start with an easy one:

```
len(things)
# Expected result
# 10
```

The `len()` function works for built-in Python data structures, but it also works with any class that implements the `__len__()` method. For example, calling `len()` on a NumPy array returns the size of the first dimension.
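This also means you can make your own classes work with `len()`. Here’s a minimal sketch (the `Playlist` class is hypothetical, just for illustration):

```python
class Playlist:
    """A toy container that supports len() by implementing __len__()."""

    def __init__(self, tracks):
        self.tracks = list(tracks)

    def __len__(self):
        return len(self.tracks)

playlist = Playlist(["intro", "verse", "outro"])
len(playlist)
# Expected result
# 3
```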

How many *unique* values are there in a list? Answer this question by first creating a unique collection of values (that is, a set). Then call `len()` on the set:

```
len(set(things))
# Expected result
# 4
```

One thing to point out here is that `things` doesn’t have to be a list of strings for this to work. In Python, you can put any hashable object into a set. By default, this includes simple data types, but you can implement the `__eq__()` and `__hash__()` methods that handle object equality and object hashes (respectively) in order to make any object hashable.

How many instances of `"a"` are there in the list? You can find out with the `.count()` method:

```
things.count("a")
# Expected result
# 4
```

Convenient!

OK, but what if we want to count the number of instances of all unique values? If you use Pandas or SQL, you will probably recognize this as a `group by` operation. Indeed, Python comes with an `itertools.groupby()` function that does exactly this. But it’s a bit of a pain because you have to sort your list before passing it in. And if you forget to sort your list, you don’t get an error, you just get the wrong result.
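If you do want to use `itertools.groupby()`, here’s a sketch that remembers to sort first:

```python
from itertools import groupby

things = ["a", "a", "b", "a", "b", "c", "a", "b", "c", "d"]

# groupby() only groups runs of equal elements, so sort first
grouped = {value: len(list(group)) for value, group in groupby(sorted(things))}
grouped
# Expected result
# {'a': 4, 'b': 3, 'c': 2, 'd': 1}
```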

Instead, let’s go back to our trusty friend the set. If we loop through all the unique values (the set of values), then we can call the `.count()` method with each one. That will tell us what we need to know:

```
for value in set(things):
    print(value, things.count(value))
# Expected result
# a 4
# c 2
# b 3
# d 1
```

This is easy, and efficient enough for most lists (although note that each `.count()` call scans the entire list).

One other thing to mention is that if you want to know all of these counts for a list, you should consider creating a dictionary of value counts first. You can use a `collections.defaultdict` for this, but you can also create it in a one-liner with a dictionary comprehension:

```
counts = {value: things.count(value) for value in set(things)}
counts
# Expected result
# {'a': 4, 'b': 3, 'c': 2, 'd': 1}
```

Now we have the count of all unique values. But you can also get all the other counts that we discussed above:

```
# Count all values in the list
sum(counts.values())
# Expected result
# 10
# Count unique values in the list
len(counts.keys())
# Expected result
# 4
# Count instances of a specific value
counts["a"]
# Expected result
# 4
```

The post How to Use the PyTorch Sigmoid Operation appeared first on Sparrow Computing.

The sigmoid function squashes any real-valued input into the range (0, 1), which is why its output is often interpreted as a probability like `p(y == 1)`. Mathematically, the function is `1 / (1 + np.exp(-x))`. And plotting it creates a well-known curve:
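As a quick sanity check of that formula, here’s a minimal NumPy sketch:

```python
import numpy as np

def sigmoid(x):
    """Direct implementation of 1 / (1 + exp(-x))."""
    return 1 / (1 + np.exp(-x))

# sigmoid(0) is exactly 0.5; large inputs approach (but never reach) 1
float(sigmoid(0.0)), float(sigmoid(10.0))
```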

Similar to other activation functions like softmax, there are two patterns for applying the sigmoid activation function in PyTorch. Which one you choose will depend more on your style preferences than anything else.

The first way to apply the sigmoid operation is with the `torch.sigmoid()` function:

```
import torch
torch.manual_seed(1)
x = torch.randn((3, 3, 3))
y = torch.sigmoid(x)
y.min(), y.max()
# Expected output
# (tensor(0.1667), tensor(0.9364))
```

There are a couple things to point out about this function. First, the operation works element-wise, so `x` can have any input dimension you want; the output dimension will be the same. Second, `torch.sigmoid()` is functionally the same as `torch.nn.functional.sigmoid()`, which was more common in older versions of PyTorch but has been deprecated since the 1.0 release.

The second pattern you will sometimes see is instantiating the `torch.nn.Sigmoid()` class and then using the callable object. This is more common in PyTorch model classes:

```
class MyModel(torch.nn.Module):
    def __init__(self, input_dim):
        super().__init__()
        self.linear = torch.nn.Linear(input_dim, 1)
        self.activation = torch.nn.Sigmoid()

    def forward(self, x):
        x = self.linear(x)
        return self.activation(x)

torch.manual_seed(1)
model = MyModel(4)
x = torch.randn((10, 4))
y = model(x)
y.min(), y.max()
# Expected output
# (tensor(0.2182, grad_fn=<MinBackward1>),
# tensor(0.5587, grad_fn=<MaxBackward1>))
```

The output of this snippet shows that PyTorch is keeping track of gradients for us. But the function and class-based approaches are equivalent, and you’re free to call `torch.sigmoid()` inside the `forward()` method if you prefer. If you’re skeptical, then prove it to yourself by copying this snippet and replacing `self.activation(x)` with `torch.sigmoid(x)`.

And that’s all there is to know about the PyTorch sigmoid operation. Happy coding.

The post PyTorch Tensor to NumPy Array and Back appeared first on Sparrow Computing.

PyTorch is designed to be pretty compatible with NumPy. Because of this, converting a NumPy array to a PyTorch tensor is simple:

```
import torch
import numpy as np
x = np.eye(3)
torch.from_numpy(x)
# Expected result
# tensor([[1., 0., 0.],
# [0., 1., 0.],
# [0., 0., 1.]], dtype=torch.float64)
```

All you have to do is use the `torch.from_numpy()` function.

Once the tensor is in PyTorch, you may want to change the data type:

```
x = np.eye(3)
torch.from_numpy(x).type(torch.float32)
# Expected result
# tensor([[1., 0., 0.],
#         [0., 1., 0.],
#         [0., 0., 1.]])
```

All you have to do is call the `.type()` method. Easy enough.

Or, you may want to send the tensor to a different device, like your GPU:

```
x = np.eye(3)
torch.from_numpy(x).to("cuda")
# Expected result
# tensor([[1., 0., 0.],
# [0., 1., 0.],
# [0., 0., 1.]], device='cuda:0', dtype=torch.float64)
```

The `.to()` method sends a tensor to a different device. Note: the above only works if you’re running a version of PyTorch that was compiled with CUDA and have an Nvidia GPU on your machine. You can test whether that’s true with `torch.cuda.is_available()`.

Going the other direction is slightly more involved because you will sometimes have to deal with two differences between a PyTorch tensor and a NumPy array:

- PyTorch can target different devices (like GPUs).
- PyTorch supports automatic differentiation.

In the simplest case, when you have a PyTorch tensor without gradients on a CPU, you can simply call the `.numpy()` method:

```
x = torch.eye(3)
x.numpy()
# Expected result
# array([[1., 0., 0.],
# [0., 1., 0.],
# [0., 0., 1.]], dtype=float32)
```

But, if the tensor is part of a computation graph that requires a gradient (that is, if `x.requires_grad` is true), you will need to call the `.detach()` method:

```
x = torch.eye(3)
x.requires_grad = True
x.detach().numpy()
# Expected result
# array([[1., 0., 0.],
# [0., 1., 0.],
# [0., 0., 1.]], dtype=float32)
```

And if the tensor is on a device other than `"cpu"`, you will need to bring it back to the CPU before you can call the `.numpy()` method. We saw this above when sending a tensor to the GPU with `.to("cuda")`. Now, we just go in reverse:

```
x = torch.eye(3)
x = x.to("cuda")
x.to("cpu").numpy()
# Expected result
# array([[1., 0., 0.],
# [0., 1., 0.],
# [0., 0., 1.]], dtype=float32)
```

Both the `.detach()` method and the `.to("cpu")` method are idempotent. So, if you want to, you can plan on calling them every time you want to convert a PyTorch tensor to a NumPy array, even when it’s not strictly necessary:

```
x = torch.eye(3)
x.detach().to("cpu").numpy()
# Expected result
# array([[1., 0., 0.],
# [0., 1., 0.],
# [0., 0., 1.]], dtype=float32)
```

By the way, if you want to perform image transforms on a NumPy array directly you can! All you need is to have a transform that accepts NumPy arrays as input. Check out my post on TorchVision transforms if you want to learn more.

The post TorchVision Transforms: Image Preprocessing in PyTorch appeared first on Sparrow Computing.

TorchVision ships with image preprocessing utilities in the `torchvision.transforms` module. The module contains a set of common, composable image transforms and gives you an easy way to write new custom transforms. As you would expect, these custom transforms can be included in your pre-processing pipeline like any other transform from the module.
Let’s start with a common use case, preparing PIL images for one of the pre-trained TorchVision image classifiers:

```
import io

import requests
import torchvision.transforms as T
from PIL import Image

resp = requests.get('https://sparrow.dev/assets/img/cat.jpg')
img = Image.open(io.BytesIO(resp.content))

preprocess = T.Compose([
    T.Resize(256),
    T.CenterCrop(224),
    T.ToTensor(),
    T.Normalize(
        mean=[0.485, 0.456, 0.406],
        std=[0.229, 0.224, 0.225]
    )
])

x = preprocess(img)
x.shape
# Expected result
# torch.Size([3, 224, 224])
```

Here, we apply the following in order:

- Resize the PIL image so its smaller edge is 256 pixels, with the other dimension scaled to maintain the aspect ratio of the input image.
- Crop the `(224, 224)` center pixels.
- Convert the PIL image to a PyTorch tensor (which also moves the channel dimension to the beginning).
- Normalize the image by subtracting a known ImageNet mean and standard deviation.

Let’s go a notch deeper to understand exactly how these transforms work.

TorchVision transforms are extremely flexible – there are just a few rules. In order to be composable, transforms need to be callables. That means you can actually just use lambdas if you want:

```
times_2_plus_1 = T.Compose([
    lambda x: x * 2,
    lambda x: x + 1,
])

x.mean(), times_2_plus_1(x).mean()
# Expected result
# (tensor(1.2491), tensor(3.4982))
```

But often, you’ll want to use callable classes because they give you a nice way to parameterize the transform at initialization. For example, if you know you want to resize images to a smaller edge of `256`, you can instantiate the `T.Resize` transform with `256` as input to the constructor:

`resize_callable = T.Resize(256)`

Any PIL image passed to `resize_callable()` will now get resized so its smaller edge is 256:

```
resize_callable(img).size
# Expected result
# (385, 256)
```

This behavior is important because you will typically want TorchVision or PyTorch to be responsible for calling the transform on an input. We actually saw this in the first example: the component transforms (`Resize`, `CenterCrop`, `ToTensor`, and `Normalize`) were chained and called inside the `Compose` transform. And the calling code would not have knowledge of things like the size of the output image you want or the mean and standard deviation for normalization.

Interestingly, there is no `Transform` base class. Some transforms have no parent class at all and some inherit from `torch.nn.Module`. This means that if you’re writing a transform class, the constructor can do whatever you want. The only requirement is that there must be a `__call__()` method to ensure the instantiated object is callable. Note: when transforms subclass `torch.nn.Module`, they will typically define the `forward()` method and then the base class takes care of `__call__()`.

Additionally, there are no real constraints on the callable’s inputs or outputs. A few examples:

- `T.Resize`: PIL image in, PIL image out.
- `T.ToTensor`: PIL image in, PyTorch tensor out.
- `T.Normalize`: PyTorch tensor in, PyTorch tensor out.

NumPy arrays may also be a good choice sometimes.

Ok. Now that we know a little about what transforms are, let’s look at an example that TorchVision gives us out of the box.

The `T.Compose` transform takes a list of other transforms in the constructor and applies them sequentially to the input. We can take a look at the `__init__()` and `__call__()` methods from a recent commit hash to see how this works:

```
class Compose:
    def __init__(self, transforms):
        self.transforms = transforms

    def __call__(self, img):
        for t in self.transforms:
            img = t(img)
        return img
```

Very simple! You can pass the `T.Compose` constructor a list (or any other in-memory sequence) of callables and it will dutifully apply them to any input one at a time. And notice that the input `img` can be any type you want. In the first example, the input was a `PIL` image and the output was a PyTorch tensor. In the second example, the input and output were both tensors. `T.Compose` doesn’t care!

Let’s instantiate a new `T.Compose` transform that will let us visualize PyTorch tensors. Remember, we took a PIL image and generated a PyTorch tensor that’s ready for inference in a TorchVision classifier. Let’s take a PyTorch tensor from that transformation and convert it into an RGB NumPy array that we can plot with Matplotlib:

```
%matplotlib inline
import matplotlib.pyplot as plt
import numpy as np

reverse_preprocess = T.Compose([
    T.ToPILImage(),
    np.array,
])

plt.imshow(reverse_preprocess(x));
```

The `T.ToPILImage` transform converts the PyTorch tensor to a PIL image with the channel dimension at the end and scales the pixel values up to `uint8`. Then, since we can pass any callable into `T.Compose`, we pass in the `np.array()` constructor to convert the `PIL` image to NumPy. Not too bad!

As we’ve now seen, not all TorchVision transforms are callable classes. In fact, TorchVision comes with a bunch of nice functional transforms that you’re free to use. If you look at the `torchvision.transforms` code, you’ll see that almost all of the real work is being passed off to functional transforms.

For example, here’s the functional version of the resize logic we’ve already seen:

```
import torchvision.transforms.functional as F
F.resize(img, 256).size
# Expected result
# (385, 256)
```

It does the same work, but you have to pass additional arguments in when you call it. My advice: use functional transforms for writing custom transform classes, but in your pre-processing logic, use callable classes or single-argument functions that you can compose.

At this point, we know enough about TorchVision transforms to write one of our own.

Let’s write a custom transform that erases the top left corner of an image with the color of a randomly selected pixel. We’ll use the `F.erase()` function and we’ll allow the caller to specify how many pixels they want to erase in both directions:

```
import torch

class TopLeftCornerErase:
    def __init__(self, n_pixels: int):
        self.n_pixels = n_pixels

    def __call__(self, img: torch.Tensor) -> torch.Tensor:
        all_pixels = img.reshape(3, -1).transpose(1, 0)
        idx = torch.randint(len(all_pixels), (1,))[0]
        random_pixel = all_pixels[idx][:, None, None]
        return F.erase(img, 0, 0, self.n_pixels, self.n_pixels, random_pixel)
```

In the constructor, all we do is take the number of pixels as a parameter from the caller. The magic happens in the `__call__()` method:

- Create a reshaped view of the image tensor as an `(n_pixels, 3)` tensor.
- Randomly select a pixel index using `torch.randint()`.
- Add two dummy dimensions to the selected pixel. This is because `F.erase()` needs the fill value to broadcast against the image, which has these two dimensions.
- Call and return `F.erase()`, which takes six arguments: the tensor, the `i` coordinate to start at, the `j` coordinate to start at, the `height` of the box to erase, the `width` of the box to erase, and the random pixel.

We can apply this custom transform just like any other transform. Let’s use `T.Compose` to both apply this erase transform and then convert it to NumPy for plotting:

```
torch.manual_seed(1)
erase = T.Compose([
    TopLeftCornerErase(100),
    reverse_preprocess,
])
plt.imshow(erase(x));
```

We’ve seen this type of transform composition multiple times now. One thing that is important to point out is that you need to call `torch.manual_seed()` if you want a deterministic (and therefore reproducible) result for any TorchVision transform that has random behavior in it. This is new as of version `0.8.0`.

And that’s about all there is to know about TorchVision transforms! They’re lightweight and flexible, but using them will make your image preprocessing code much easier to reason about.

The post NumPy Where: Understanding np.where() appeared first on Sparrow Computing.

The NumPy `where()` function is like a vectorized switch that you can use to combine two arrays. For example, let’s say you have an array with some data and you want to create a new array with 1 whenever an element in the data array is more than one standard deviation from the mean and -1 for all other elements.
This is a perfect use case for `np.where()`. First, create a boolean array for your conditional, and then call `np.where()`:

```
import numpy as np
import pandas as pd
df = pd.read_csv("https://jbencook.s3.amazonaws.com/data/dummy-sales.csv")
condition = np.abs(df.revenue - df.revenue.mean()) > df.revenue.std()
np.where(condition, 1, -1)
# Expected result
# array([ 1, -1, -1, 1, -1, -1, -1, -1, 1, 1, -1, 1, -1, -1, 1, 1, -1,
# -1, -1, 1, -1, -1, -1, -1, 1, 1, 1, 1, 1, 1])
```

The arguments here are:

- `condition`: a NumPy array of elements that evaluate to True or False
- `x`: an optional array-like result for elements that evaluate to True
- `y`: an optional array-like result for elements that evaluate to False

The elements of `condition` don’t actually need to have a boolean type, as long as they can be coerced to a boolean (e.g. non-zero integers are interpreted as True). Also, both `x` and `y` are optional, but if you provide one, you need to provide both. Additionally, the input arrays can have any shape, so you can use this as a multi-dimensional switch.
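Since `x` and `y` broadcast against the condition, they can even be scalars. A small sketch of the multi-dimensional switch:

```python
import numpy as np

x = np.arange(6).reshape(2, 3)

# Keep even elements, replace odd ones with the scalar -1
np.where(x % 2 == 0, x, -1)
# Expected result
# array([[ 0, -1,  2],
#        [-1,  4, -1]])
```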

One thing to watch out for: the return value takes a different form if you don’t supply `x` and `y`. In that case, `np.where()` returns the indices of the true elements (for a 1-D vector) and the indices for all axes where the elements are true for higher dimensional cases. This is equivalent to `np.argwhere()` except that the index arrays are split by axis.

You can see how this works by calling `np.stack()` on the result of `np.where()`:

```
x = np.eye(4)
np.stack(np.where(x), -1) == np.argwhere(x)
# Expected result
# array([[ True, True],
# [ True, True],
# [ True, True],
# [ True, True]])
```

This makes `np.where()` without the `x` and `y` inputs equivalent to calling the `.nonzero()` method on the condition array:

```
np.stack(x.nonzero(), -1) == np.argwhere(x)
# Expected result
# array([[ True, True],
# [ True, True],
# [ True, True],
# [ True, True]])
```

**Multi-dimensional binary cross entropy**

Now that we know how the API works, let’s look at another example: multi-dimensional binary cross entropy. Say we have a 3-D array of binary class probabilities `yhat` and a 3-D array of binary labels `y`. The one-liner formula for binary cross-entropy is the following:

`-(y * np.log(yhat) + (1 - y) * np.log(1 - yhat)).mean()`

This does work in the multi-dimensional case because NumPy defaults to element-wise operations. Multiplying the log terms by `y` and `1 - y` makes them function like switches: when `y == 1` the first term is included and when `y == 0` the second term is included:

```
np.random.seed(1)
yhat = np.random.uniform(size=(3, 3, 3))
y = np.random.randint(0, 2, size=(3, 3, 3))
-(y * np.log(yhat) + (1 - y) * np.log(1 - yhat)).mean()
# Expected result
# 1.221865004504288
```

But we can accomplish the same thing with `np.where()`:

```
-np.where(y, np.log(yhat), np.log(1 - yhat)).mean()
# Expected result
# 1.221865004504288
```

Pretty cool! This is not necessarily a better implementation in any important way, but it does make the function of the `y` and `1 - y` terms very clear.

The post Finding the Mode of an Empirical Continuous Distribution appeared first on Sparrow Computing.

Suppose you have a function `f` that generates samples from an unknown continuous probability distribution:
```
def f(n: int) -> np.ndarray:
    """Generate n samples from an unknown distribution"""
    samples = ...  # ??? We don't know ???
    return samples
```

The sample generating logic could be literally unknown (as above), or it could be some complex thing that we do know but can’t easily model with a parameterized distribution. You still might want to know the mode.

In case you’re thinking “that’s obvious”, let me run through a couple solutions that don’t work.

- You can’t count occurrences of values found in `samples`. That would work for a discrete distribution with a known set of possible values. But you can’t expect to find more than one occurrence of any of the values in `samples`. Mathematically, you will never get the same value twice. On a computer you will eventually get some samples twice because of finite precision, but that won’t be a good estimate of the mode of the distribution.
- You can’t estimate it with the mean or median. The distribution might not be symmetric! And it might be multi-modal! If you can reasonably fit a named distribution to your samples, then the problem becomes easier: estimate the distribution parameters and look up the mode online!

Ok, so the solution isn’t obvious (at least it wasn’t obvious to me), but that doesn’t mean it’s complex. If you can generate (or already have) samples, you should start by plotting the distribution with a histogram.

Let’s write an `f` function so we can take a look:

```
import numpy as np
import pandas as pd

def f(n: int) -> np.ndarray:
    """Generate n samples from an unknown distribution"""
    df = pd.read_csv("https://jbencook.s3.amazonaws.com/data/dummy-sales-large.csv")
    np.random.seed(1)
    samples = np.random.choice(df.revenue, n)
    return samples
```

Now plot the distribution:

```
import seaborn as sns
sns.histplot(f(1000), bins=20);
```

Just by looking at the plot, we can tell that the mode of this distribution is slightly less than 200. So how do you find the mode of an empirical distribution? Well, one good method is visual inspection! Just plot the histogram and take a look.

Ok, but what if you need to find the mode programmatically? This can be accomplished with the `np.histogram()` function from NumPy:

```
samples = f(1000)
counts, bins = np.histogram(samples, bins=20)
max_bin = np.argmax(counts)
bins[max_bin:max_bin+2].mean()
# Expected result
# 179.3633778773795
```

As expected: slightly less than 200! The `np.histogram()` function returns two arrays, one with the number of instances per bin and one with the edges of the bins. That means we can estimate the location of the maximum height of the distribution by getting the index of the maximum count and then taking the midpoint of that bin.

The big factor in determining how precise your estimate will be is your bin size, which you can think of as a hyperparameter for this procedure. I don’t have a great rule of thumb for picking this. What I do is plot the histogram and mess with number of bins until I achieve a “reasonable amount of smoothness”. But generally speaking, the more data you have, the more bins you can use and the more precise your estimate will be.
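To see the whole procedure end-to-end on a distribution with a known answer, here’s a self-contained sketch that swaps in a standard normal (whose true mode is 0) for the unknown sampler:

```python
import numpy as np

# Stand-in sampler with a known mode of 0
np.random.seed(0)
samples = np.random.normal(loc=0.0, scale=1.0, size=100_000)

counts, bins = np.histogram(samples, bins=50)
max_bin = np.argmax(counts)
mode_estimate = bins[max_bin:max_bin + 2].mean()
# mode_estimate lands near the true mode of 0, within roughly one bin width
```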

The mode is sometimes a good pick for a “typical” value to summarize a distribution. Remember the mean is susceptible to outliers and the median will tend to have low probability density if the distribution is multi-modal. In the example we’re using the mean is around 482 and the median is around 359. Both of these are actually relatively low probability values so it might be important to estimate the mode!

The post NumPy All: Understanding np.all() appeared first on Sparrow Computing.

The `np.all()` function tests whether all elements in a NumPy array evaluate to true:
```
np.all(np.array([[1, 1], [1, 1]]))
# Expected result
# True
```

Notice the input can have arbitrary shape and the data type does not have to be boolean (it just has to be truthy). If any of the elements don’t evaluate to true, the function returns false:

```
np.all(np.array([[0, 1], [1, 1]]))
# Expected result
# False
```

We can also use the optional `axis` argument to make `np.all()` a reducing operation. Say we want to know which rows in a matrix have elements that all evaluate to true. We can do that by passing in `axis=-1`:

```
np.all(np.ones((2, 3)), axis=-1)
# Expected result
# array([ True, True])
```

In the above example, there are two rows and for each of them, all elements evaluate to true. The `-1` value here is shorthand for “the last axis”.
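Here’s the same reduction with a row that fails the test, to make the per-row behavior concrete:

```python
import numpy as np

x = np.array([[1, 1, 1], [1, 0, 1]])

# The second row contains a 0, which evaluates to false
np.all(x, axis=-1)
# Expected result
# array([ True, False])
```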

And that’s it! NumPy also has a function called `np.any()` which has the same API as `np.all()` but returns true when *any* of the elements evaluate to true.

The post Binary Cross Entropy Explained appeared first on Sparrow Computing.

Here’s a NumPy implementation of binary cross entropy:

```
import numpy as np

def binary_cross_entropy(yhat: np.ndarray, y: np.ndarray) -> float:
    """Compute binary cross-entropy loss for a vector of predictions

    Parameters
    ----------
    yhat
        An array with len(yhat) predictions between [0, 1]
    y
        An array with len(y) labels where each is one of {0, 1}
    """
    return -(y * np.log(yhat) + (1 - y) * np.log(1 - yhat)).mean()
```

Why does this work? The motivation for this loss function comes from information theory. We’re trying to minimize the difference between the `y` and `yhat` distributions. That is, we want to minimize the difference between ground truth labels and model predictions. This is an elegant solution for training machine learning models, but the intuition is even simpler than that.

Binary classifiers, such as logistic regression, predict yes/no target variables that are typically encoded as 1 (for yes) or 0 (for no). When the model produces a floating point number between 0 and 1 (`yhat` in the function above), you can often interpret that as `p(y == 1)`, or the probability that the true answer for that record is “yes”. The data you use to train the algorithm will have labels that are either 0 or 1 (`y` in the function above), since the answer for each record in your training data is known.

To train a good model, you want to penalize predictions that are far away from their ground truth values. That means you want to penalize values close to 0 when the label is 1 and you want to penalize values close to 1 when the label is 0.

The `y` and `(1 - y)` terms act like switches so that `np.log(yhat)` is added when the true answer is “yes” and `np.log(1 - yhat)` is added when the true answer is “no”. That would move the loss in the opposite direction that we want (since, for example, `np.log(yhat)` is larger when `yhat` is closer to 1 than 0), so we take the negative of the sum instead of the sum itself.

Here’s a plot with the first and second log terms (respectively) when they’re switched on:

Notice the log function increasingly penalizes values as they approach the wrong end of the range.

A couple other things to watch out for:

- Since we’re taking `np.log(yhat)` and `np.log(1 - yhat)`, we can’t use a model that predicts 0 or 1 for `yhat`. This is because `np.log(0)` is `-inf`. For this reason, we typically apply the sigmoid activation function to raw model outputs. This allows values to get close to 0 or 1, but never actually reach the extremes of the range.
- We typically divide by the number of records so the value is normalized and comparable across datasets with different sizes. This is the purpose of the `.mean()` method call in the implementation above.

Of course, you probably don’t need to implement binary cross entropy yourself. The loss function comes out of the box in PyTorch and TensorFlow. When you use the loss function in these deep learning frameworks, you get automatic differentiation so you can easily learn weights that minimize the loss. You can also use the same loss function in scikit-learn.

The post Filtering DataFrames with the .query() Method in Pandas appeared first on Sparrow Computing.

Pandas provides a `.query()` method on DataFrames with a convenient string syntax for filtering. Think of the `.query()` syntax like the `where` clause in SQL.
Here’s a basic example:

```
import pandas as pd
df = pd.read_csv("https://jbencook.s3.amazonaws.com/data/dummy-sales.csv")
df.query("region == 'APAC' and revenue < 300")
# Expected result
# date region revenue
# 3 1999-01-06 APAC 135
# 9 1999-01-18 APAC 147
# 11 1999-01-24 APAC 100
# 24 1999-03-20 APAC 108
```

The query string `"region == 'APAC' and revenue < 300"` selects the rows where `region` is `'APAC'` and `revenue` is less than 300.

Pretty simple! You can also reference local variables by prefixing them with `@`. If we wanted to get examples where `revenue` was 1 standard deviation above the mean, we could first compute these values and then reference them in our query string:

```
avg_revenue = df.revenue.mean()
std_revenue = df.revenue.std()
df.query("revenue > @avg_revenue + @std_revenue")
# Expected result
# date region revenue
# 0 1999-01-02 APAC 928
# 8 1999-01-16 APAC 970
# 19 1999-02-16 EMEA 918
# 25 1999-03-23 AMER 972
# 26 1999-03-24 AMER 956
# 27 1999-03-24 EMEA 954
# 29 1999-03-26 AMER 994
```

But you can also call methods on the columns inside the string. Here’s the same query without pre-computing the mean and standard deviation:

```
df.query("revenue > revenue.mean() + revenue.std()")
# Expected result
# date region revenue
# 0 1999-01-02 APAC 928
# 8 1999-01-16 APAC 970
# 19 1999-02-16 EMEA 918
# 25 1999-03-23 AMER 972
# 26 1999-03-24 AMER 956
# 27 1999-03-24 EMEA 954
# 29 1999-03-26 AMER 994
```

It’s worth mentioning one other cool trick. You can check whether a column value is in a local list:

```
valid_dates = ["1999-01-02", "1999-01-03", "1999-01-04"]
df.query("date in @valid_dates")
# Expected result
# date region revenue
# 0 1999-01-02 APAC 928
# 1 1999-01-03 AMER 526
# 2 1999-01-04 EMEA 497
```

A few other things to be aware of:

- You can’t reference columns if they share a name with Python keywords.
- You can use backticks, e.g. `` `hello world` ``, to reference columns that aren’t valid Python variables.
- The result is a new DataFrame, unless you pass `inplace=True`, in which case it modifies the existing DataFrame.
- Performance of `.query()` will often be better than complex masking operations (such as `df[(df.region == "APAC") & (df.revenue < 300)]`), because `.query()` doesn’t create intermediate objects, leaving everything in C.
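As an illustration of the backtick syntax, here’s a small hypothetical DataFrame whose column name contains a space:

```python
import pandas as pd

df = pd.DataFrame({"hello world": [1, 2, 3], "region": ["APAC", "AMER", "EMEA"]})

# Backticks let the query string reference the invalid identifier
df.query("`hello world` > 1")
# Expected result
#    hello world region
# 1            2   AMER
# 2            3   EMEA
```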

Check out my Jupyter notebook if you want to play around with the `.query()` method!

The post Linear Interpolation in Python: An np.interp() Example appeared first on Sparrow Computing.

For linear interpolation in Python, you can use the `np.interp()` function from NumPy:
```
import numpy as np
points = [-2, -1, 0, 1, 2]
values = [4, 1, 0, 1, 4]
x = np.linspace(-2, 2, num=10)
y = np.interp(x, points, values)
```

Notice that you have to pass in:

- A set of points where you want the interpolated value (`x`)
- A set of points with a known value (`points`)
- The set of known values (`values`)

Let’s plot the known points in blue and the interpolated points in orange so we can see what’s happening:

```
import matplotlib.pyplot as plt
plt.plot(points, values, 'o')
plt.plot(x, y, 'o', alpha=0.5)
plt.xlabel("x")
plt.ylabel("y");
```

Easy enough.
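One detail worth knowing: outside the range of the known `points`, `np.interp()` doesn’t extrapolate; it clamps to the edge values:

```python
import numpy as np

points = [-2, -1, 0, 1, 2]
values = [4, 1, 0, 1, 4]

# Both queries fall outside [-2, 2], so both return the nearest edge value
np.interp([-5, 3], points, values)
# Expected result
# array([4., 4.])
```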
