https://swcarpentry.github.io/python-novice-inflammation/instructor/02-numpy.html
Use numpy.loadtxt() to load the inflammation dataset, then view it, store it in a variable, and print it (9:25-9:30)
Check the variable with type() (an n-dimensional array) (9:20-9:22)
Show .shape and use this as an example of an “attribute” of a Python object (9:35-9:38)
Introduce the dir() function (9:38-9:40)
Index various elements from the numpy array, e.g. data[0,0], and discuss that these start from the top left (9:40-9:43)
Show what happens if you don't include one of the bounds on the slice, e.g. data[:3, 36:]
numpy.mean(data) (9:48-9:52)
maxval, minval, stdval = numpy.amax(data), numpy.amin(data), numpy.std(data)
patient_0 = data[0, :] # 0 on the first axis (rows), everything on the second (columns)
print('maximum inflammation for patient 0:', numpy.amax(patient_0))
print('maximum inflammation for patient 2:', numpy.amax(data[2, :]))
axis parameter to apply a function to each row or column (9:58-10:00)
print(numpy.mean(data, axis=0))
print(numpy.mean(data, axis=1))
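The indexing, slicing, and axis steps above can be rehearsed without the CSV file. The sketch below is only an illustration: it assumes a made-up stand-in array with the same 60 x 40 shape as the inflammation data, so the values are not real measurements.

```python
import numpy

# Stand-in for the inflammation data: 60 patients (rows) x 40 days (columns).
# The values are invented; only the shape mirrors the lesson's dataset.
data = numpy.arange(60 * 40, dtype=float).reshape(60, 40)

print(type(data))   # the n-dimensional array type
print(data.shape)   # an attribute, so no parentheses are needed

# Indexing starts at the top left: row 0, column 0.
first_value = data[0, 0]

# Omitting a bound means "from the start" or "to the end".
corner = data[:3, 36:]   # rows 0-2, columns 36-39
print(corner.shape)

# axis=0 collapses the rows, leaving one mean per day (column).
mean_per_day = numpy.mean(data, axis=0)
# axis=1 collapses the columns, leaving one mean per patient (row).
mean_per_patient = numpy.mean(data, axis=1)
print(mean_per_day.shape, mean_per_patient.shape)
```

Comparing the two output shapes with data.shape is a quick way to show learners which axis was collapsed.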
https://swcarpentry.github.io/python-novice-inflammation/instructor/03-matplotlib.html
Work through the SWC lesson as written, but use the plotting conventions below:
import matplotlib.pyplot as plt
fig, ax = plt.subplots()
ax.plot(xdata, ydata)
plt.show()
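As a rough illustration of these conventions applied to a lesson-style plot, the sketch below draws the mean per day of a stand-in array. The array values, the axis labels, and the output filename mean_per_day.png are assumptions made for this example, and the Agg backend is selected only so the sketch also runs without a display; in a live session, end with plt.show() instead of saving.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: non-interactive backend for headless runs
import matplotlib.pyplot as plt
import numpy as np

# Stand-in for the inflammation data (60 patients x 40 days, invented values).
data = np.arange(60 * 40, dtype=float).reshape(60, 40)
mean_per_day = np.mean(data, axis=0)

# Create the figure and axes explicitly, then call methods on the axes.
fig, ax = plt.subplots()
ax.plot(mean_per_day)
ax.set_xlabel("day")
ax.set_ylabel("mean inflammation")

# Hypothetical output filename; replace with plt.show() when teaching live.
fig.savefig("mean_per_day.png")
plt.close(fig)
```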
plot-sxl.py
Walk through the script interactively (i.e., “Who can tell me what this line is doing?”)
#!/usr/bin/env python3
import numpy as np
import matplotlib.pyplot as plt
# Get dataset to recreate Fig 3B from Lott et al 2011 PLoS Biology https://pubmed.gov/21346796
# wget https://github.com/bxlab/cmdb-quantbio/raw/main/assignments/lab/bulk_RNA-seq/extra_data/all_annotated.csv
transcripts = np.loadtxt( "all_annotated.csv", delimiter=",", usecols=0, dtype="<U30", skiprows=1 )
print( "transcripts: ", transcripts[0:5] )
samples = np.loadtxt( "all_annotated.csv", delimiter=",", max_rows=1, dtype="<U30" )[2:]
print( "samples: ", samples[0:5] )
data = np.loadtxt( "all_annotated.csv", delimiter=",", dtype=np.float32, skiprows=1, usecols=range(2, len(samples) + 2) )
print( "data: ", data[0:5, 0:5] )
# Find row with transcript of interest
for i in range(len(transcripts)):
    if transcripts[i] == 'FBtr0331261':
        row = i
# Find columns with samples of interest
cols = []
for i in range(len(samples)):
    if "female" in samples[i]:
        cols.append(i)
# Subset data of interest
expression = data[row, cols]
# Prepare data
x = samples[cols]
y = expression
# Plot data
fig, ax = plt.subplots()
ax.set_title( "FBtr0331261" )
ax.plot( x, y )
fig.savefig( "FBtr0331261.png" )
plt.close( fig )
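To check the row and column selection logic without downloading all_annotated.csv, the sketch below runs the same loops on tiny stand-in arrays. The transcript IDs other than FBtr0331261, the sample names, and the numbers are hypothetical. It also shows a boolean-mask version as one possible alternative to the loops, not as the script's own method.

```python
import numpy as np

# Hypothetical stand-ins for the arrays plot-sxl.py loads from the CSV file.
transcripts = np.array(["FBtr0000001", "FBtr0331261", "FBtr0000002"])
samples = np.array(["female_10", "male_10", "female_11", "male_11"])
data = np.array([[1.0, 2.0, 3.0, 4.0],
                 [5.0, 6.0, 7.0, 8.0],
                 [9.0, 10.0, 11.0, 12.0]], dtype=np.float32)

# Loop version, following the script: find the row of interest.
for i in range(len(transcripts)):
    if transcripts[i] == "FBtr0331261":
        row = i

# Loop version: collect the columns whose sample name contains "female".
cols = []
for i in range(len(samples)):
    if "female" in samples[i]:
        cols.append(i)

expression_loop = data[row, cols]

# Alternative: boolean masks built with element-wise comparisons.
row_mask = transcripts == "FBtr0331261"
col_mask = np.char.find(samples, "female") >= 0  # find returns -1 when absent
expression_mask = data[row_mask][0, col_mask]

print(expression_loop)
print(expression_mask)
```

Note that the substring test matters here: "male" is contained in "female", so selecting male samples with the same pattern would also match the female columns.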