How to convert the keras model to a basenji model #180

Open
houruiyan opened this issue Oct 9, 2023 · 1 comment

@houruiyan

Dear Dr,

Thank you for your great package.

I want to fine-tune your model because my output has just one track, whereas your human model has 5313 tracks. I used seqnn_model.model to get the Keras model out of your Basenji model and then modified the model architecture. But when I try to train it, I run into a problem.

Here is my full code:

```python
import tensorflow as tf
import numpy as np
import keras
import os
import json
from basenji import seqnn
from keras.initializers import glorot_uniform
from tensorflow.keras.models import Model, Sequential
from tensorflow.keras.optimizers import Adam, RMSprop

# Load pre-trained model

# read model parameters
with open('/mnt/ruiyanhou/nfs_share2/variant_TSS/run_basenji/do_fine_tuning/params_human.json') as params_open:
    params = json.load(params_open)

params_model = params['model']
params_train = params['train']

model_file = '/mnt/ruiyanhou/nfs_share2/variant_TSS/run_basenji/do_fine_tuning/model_human.h5'

# initialize model
seqnn_model = seqnn.SeqNN(params_model)  # instantiate the class
seqnn_model.restore(model_file, head_i=0, trunk=False)  # then call the class's methods, including restore and build_ensemble

seqnn_model.model.summary()

model = seqnn_model.model

tf.keras.utils.plot_model(model, to_file='simple.png', show_shapes=True)

# cut the model at the 'tf.nn.gelu_30' activation layer
activation_layer = model.get_layer('tf.nn.gelu_30')
base_model = Model(inputs=model.input, outputs=activation_layer.output)

# freeze layers which would not be trained
def print_layer_trainable():
    for layer in base_model.layers:
        print("{0}:\t{1}".format(layer.trainable, layer.name))

print_layer_trainable()

base_model.trainable = False
for layer in base_model.layers:
    layer.trainable = False

print_layer_trainable()

# create a new model
new_model = Sequential()
new_model.add(base_model)
new_last_layer = tf.keras.layers.Dense(units=1, activation="softplus")
new_model.add(new_last_layer)

new_model.summary()

for layer in new_model.layers:
    print(layer)
    print(layer.trainable)
    print(len(layer.weights))
    print(len(layer.trainable_weights))
    print(len(layer.non_trainable_weights))

# train this model on the new dataset
from basenji import trainer
import pandas as pd
from basenji import dataset

# read datasets
data_dirs = ['/mnt/ruiyanhou/nfs_share2/variant_TSS/run_basenji/run_basenji/data_out']
train_data = []
eval_data = []

for data_dir in data_dirs:
    # set strand pairs
    targets_df = pd.read_csv('%s/targets.txt' % data_dir, sep='\t', index_col=0)

    # load train data
    train_data.append(dataset.SeqDataset(data_dir,
                                         split_label='train',
                                         batch_size=params_train['batch_size'],
                                         shuffle_buffer=params_train.get('shuffle_buffer', 128),
                                         mode='train'))

    # load eval data
    eval_data.append(dataset.SeqDataset(data_dir,
                                        split_label='valid',
                                        batch_size=params_train['batch_size'],
                                        mode='eval'))

# initialize trainer
out_dir = '/mnt/ruiyanhou/nfs_share2/variant_TSS/run_basenji/run_basenji/unit_2_output'

seqnn_trainer = trainer.Trainer(params_train, train_data,
                                eval_data, out_dir)

# compile model
seqnn_trainer.compile(new_model)
```

Any suggestions? Thank you for your help!

@davek44
Contributor

davek44 commented Oct 17, 2023

In your code, it's crashing because my Trainer class's compile method expects to be given a SeqNN class object. The SeqNN class holds on to the Keras model. See https://github.com/calico/basenji/blob/master/basenji/seqnn.py#L175
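To illustrate the point above, here is a minimal, dependency-free sketch of why a method written for a wrapper object fails on a bare model. Every class and function in it (`ToyKerasModel`, `ToyWrapper`, `toy_trainer_compile`) is a hypothetical stand-in, not Basenji code. The `model` attribute mirrors `seqnn_model.model` from the question; the `models` list is only an assumption about how such a wrapper might be laid out, so check the linked `seqnn.py` and the Trainer source for the attributes actually used.

```python
class ToyKerasModel:
    """Hypothetical stand-in for a Keras model; only tracks compile state."""

    def __init__(self):
        self.compiled = False

    def compile(self):
        self.compiled = True


class ToyWrapper:
    """Hypothetical stand-in for a wrapper class such as SeqNN."""

    def __init__(self, model):
        self.model = model      # the wrapped model, as in seqnn_model.model
        self.models = [model]   # assumed attribute, not confirmed by the thread


def toy_trainer_compile(wrapper):
    """Hypothetical trainer step that reaches the model through the wrapper."""
    for model in wrapper.models:
        model.compile()


new_model = ToyKerasModel()

# Passing the bare model fails: it has none of the wrapper's attributes.
try:
    toy_trainer_compile(new_model)
    bare_model_failed = False
except AttributeError:
    bare_model_failed = True

# Swapping the edited model into a wrapper keeps the expected interface.
wrapper = ToyWrapper(ToyKerasModel())
wrapper.model = new_model
wrapper.models = [new_model]
toy_trainer_compile(wrapper)
```

If the real classes follow a similar layout, one option may be to assign the edited Keras model back onto the existing `SeqNN` object and pass that object to the trainer, rather than passing the Keras model directly.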

An alternative that might be easier for you would be to save the predictions for your sequences across the 5313 tracks, and then fit a ridge regression on them for your single transfer-learning track. On several occasions, we've had better luck with that than with dropping and replacing the final layer.
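A rough sketch of the ridge regression idea, assuming the saved predictions have already been flattened into a 2D array of shape (positions, tracks) and the new track into a matching 1D array. The function names, the `alpha` value, and the random toy data are illustrative assumptions and not part of Basenji. The fit uses the standard closed-form ridge solution with an unpenalized intercept.

```python
import numpy as np


def fit_ridge(preds, target, alpha=1.0):
    """Closed-form ridge regression with an unpenalized intercept.

    preds:  (n_positions, n_tracks) saved model predictions
    target: (n_positions,) values of the single new track
    """
    x_mean = preds.mean(axis=0)
    y_mean = target.mean()
    xc = preds - x_mean
    yc = target - y_mean
    # Solve (X^T X + alpha * I) w = X^T y on the centered data.
    gram = xc.T @ xc + alpha * np.eye(preds.shape[1])
    weights = np.linalg.solve(gram, xc.T @ yc)
    intercept = y_mean - x_mean @ weights
    return weights, intercept


def predict_ridge(preds, weights, intercept):
    """Map saved predictions to the new track."""
    return preds @ weights + intercept


# Toy stand-in data; real input would be the saved predictions
# for the existing tracks (8 tracks here instead of thousands).
rng = np.random.default_rng(0)
n_positions, n_tracks = 200, 8
saved_preds = rng.normal(size=(n_positions, n_tracks))
true_weights = rng.normal(size=n_tracks)
new_track = saved_preds @ true_weights + 0.5

weights, intercept = fit_ridge(saved_preds, new_track, alpha=1e-6)
fitted = predict_ridge(saved_preds, weights, intercept)
```

In practice, `alpha` would be chosen on the validation split, and a library implementation such as scikit-learn's `Ridge` or `RidgeCV` would likely be more convenient than this hand-rolled version.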

Finally, depending on the task, you will likely get better results using our newer Enformer or Borzoi models.
