from katlas.core import *
import pandas as pd
Datahub
Kinase basic information
Kinase classification (group, family, and subfamily) comes from the Coral GitHub repository, protein sequences from UniProt, kinase domain sequences from kinase.com, and subcellular location data from Zhang et al.
info = Data.get_kinase_info()
info
| | kinase | ID_coral | uniprot | ID_HGNC | group | family | subfamily_coral | subfamily | in_ST_paper | in_Tyr_paper | ... | cytosol | cytoskeleton | plasma membrane | mitochondrion | Golgi apparatus | endoplasmic reticulum | vesicle | centrosome | aggresome | main_location |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
0 | AAK1 | AAK1 | Q2M2I8 | AAK1 | Other | NAK | NaN | NAK | 1 | 0 | ... | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN |
1 | ABL1 | ABL1 | P00519 | ABL1 | TK | Abl | NaN | Abl | 0 | 1 | ... | 6.0 | NaN | 4.0 | NaN | NaN | NaN | NaN | NaN | NaN | cytosol |
2 | ABL2 | ABL2 | P42684 | ABL2 | TK | Abl | NaN | Abl | 0 | 1 | ... | 4.0 | 6.0 | NaN | NaN | NaN | NaN | NaN | NaN | NaN | cytoskeleton |
3 | TNK2 | ACK | Q07912 | TNK2 | TK | Ack | NaN | Ack | 0 | 1 | ... | NaN | NaN | NaN | NaN | NaN | NaN | 8.0 | NaN | 2.0 | vesicle |
4 | ACVR2A | ACTR2 | P27037 | ACVR2A | TKL | STKR | STKR2 | STKR2 | 1 | 0 | ... | 5.0 | NaN | NaN | NaN | NaN | 5.0 | NaN | NaN | NaN | cytosol |
... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
518 | YSK1 | YSK1 | O00506 | STK25 | STE | STE20 | YSK | YSK | 1 | 0 | ... | 6.0 | NaN | NaN | NaN | 4.0 | NaN | NaN | NaN | NaN | cytosol |
519 | ZAK | ZAK | Q9NYL2 | MAP3K20 | TKL | MLK | ZAK | ZAK | 1 | 0 | ... | 5.0 | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | nucleus |
520 | ZAP70 | ZAP70 | P43403 | ZAP70 | TK | Syk | NaN | Syk | 0 | 1 | ... | 5.0 | NaN | 2.0 | NaN | NaN | NaN | NaN | NaN | NaN | cytosol |
521 | EEF2K | eEF2K | O00418 | EEF2K | Atypical | Alpha | eEF2K | eEF2K | 1 | 0 | ... | 9.0 | NaN | 1.0 | NaN | NaN | NaN | NaN | NaN | NaN | cytosol |
522 | FAM20C | FAM20C | Q8IXL6 | FAM20C | Atypical | FAM20C | NaN | FAM20C | 1 | 0 | ... | 2.0 | NaN | NaN | NaN | 7.0 | 1.0 | NaN | NaN | NaN | Golgi apparatus |
523 rows × 30 columns
info.columns
Index(['kinase', 'ID_coral', 'uniprot', 'ID_HGNC', 'group', 'family',
'subfamily_coral', 'subfamily', 'in_ST_paper', 'in_Tyr_paper',
'in_cddm', 'pseudo', 'pspa_category_small', 'pspa_category_big',
'cddm_big', 'cddm_small', 'length', 'human_uniprot_sequence',
'kinasecom_domain', 'nucleus', 'cytosol', 'cytoskeleton',
'plasma membrane', 'mitochondrion', 'Golgi apparatus',
'endoplasmic reticulum', 'vesicle', 'centrosome', 'aggresome',
'main_location'],
dtype='object')
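Because the result is an ordinary pandas DataFrame, the usual filtering idioms apply. The sketch below does not call `Data.get_kinase_info()`; it rebuilds a five-row stand-in from values printed above (the name `toy_info` is illustrative only) so that it runs without katlas, then selects tyrosine kinases by `group` and counts kinases per group.

```python
import pandas as pd

# Stand-in for a slice of the kinase info table; values copied from the output above.
toy_info = pd.DataFrame({
    "kinase": ["AAK1", "ABL1", "ABL2", "TNK2", "ACVR2A"],
    "uniprot": ["Q2M2I8", "P00519", "P42684", "Q07912", "P27037"],
    "group": ["Other", "TK", "TK", "TK", "TKL"],
    "family": ["NAK", "Abl", "Abl", "Ack", "STKR"],
    "in_ST_paper": [1, 0, 0, 0, 1],
    "in_Tyr_paper": [0, 1, 1, 1, 0],
})

# A boolean mask on the group column keeps only the TK rows.
tk = toy_info[toy_info["group"] == "TK"]
tk_names = tk["kinase"].tolist()

# Number of kinases per group in the stand-in table.
group_counts = toy_info["group"].value_counts().to_dict()

print(tk_names)
print(group_counts)
```

The same two lines should work on the full `info` frame, assuming the column names listed above.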
Kinase-substrate dataset
A combination of the Sugiyama et al. and PhosphoSitePlus kinase-substrate datasets.
Data.get_ks_dataset()
| | kin_sub_site | kinase_uniprot | substrate_uniprot | site | source | substrate_genes | substrate_phosphoseq | position | site_seq | sub_site |
---|---|---|---|---|---|---|---|---|---|---|
0 | O00141_A4FU28_S140 | O00141 | A4FU28 | S140 | Sugiyama | CTAGE9 | MEEPGATPQPYLGLVLEELGRVVAALPESMRPDENPYGFPSELVVC... | 140 | AAAEEARSLEATCEKLSRsNsELEDEILCLEKDLKEEKSKH | A4FU28_S140 |
1 | O00141_O00141_S252 | O00141 | O00141 | S252 | Sugiyama | SGK1 SGK | MTVKTEAAKGTLTYSRMRGMVAILIAFMKQRRMGLNDFIQKIANNS... | 252 | SQGHIVLTDFGLCKENIEHNsTtstFCGtPEyLAPEVLHKQ | O00141_S252 |
2 | O00141_O00141_S255 | O00141 | O00141 | S255 | Sugiyama | SGK1 SGK | MTVKTEAAKGTLTYSRMRGMVAILIAFMKQRRMGLNDFIQKIANNS... | 255 | HIVLTDFGLCKENIEHNsTtstFCGtPEyLAPEVLHKQPYD | O00141_S255 |
3 | O00141_O00141_S397 | O00141 | O00141 | S397 | Sugiyama | SGK1 SGK | MTVKTEAAKGTLTYSRMRGMVAILIAFMKQRRMGLNDFIQKIANNS... | 397 | sGPNDLRHFDPEFTEEPVPNsIGKsPDsVLVTAsVKEAAEA | O00141_S397 |
4 | O00141_O00141_S404 | O00141 | O00141 | S404 | Sugiyama | SGK1 SGK | MTVKTEAAKGTLTYSRMRGMVAILIAFMKQRRMGLNDFIQKIANNS... | 404 | HFDPEFTEEPVPNsIGKsPDsVLVTAsVKEAAEAFLGFsYA | O00141_S404 |
... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
187061 | Q9Y6R4_P62273_Y7 | Q9Y6R4 | P62273 | Y7 | Sugiyama | RPS29 | MGHQQLyWsHPRKFGQGSRSCRVCSNRHGLIRKyGLNMCRQCFRQY... | 7 | ______________MGHQQLyWsHPRKFGQGSRSCRVCSNR | P62273_Y7 |
187062 | Q9Y6R4_Q86W56_Y832 | Q9Y6R4 | Q86W56 | Y832 | Sugiyama | PARG | MNAGPGCEPCTKRPRWGAATtsPAASDARSFPSRQRRVLDPKDAHV... | 832 | DDWQRRCTEIVAIDALHFRRyLDQFVPEKMRRELNKAYCGF | Q86W56_Y832 |
187063 | Q9Y6R4_Q9Y6R4_T1324 | Q9Y6R4 | Q9Y6R4 | T1324 | Sugiyama | MAP3K4 KIAA0213 MAPKKK4 MEKK4 MTK1 | MREAAAALVPPPAFAVTPAAAMEEPPPPPPPPPPPPEPETESEPEC... | 1324 | FEEKRYREMRRKNIIGQVCDtPKSyDNVMHVGLRKVTFKWQ | Q9Y6R4_T1324 |
187064 | Q9Y6R4_Q9Y6R4_T1494 | Q9Y6R4 | Q9Y6R4 | T1494 | SIGNOR|EPSD|PSP | MAP3K4 KIAA0213 MAPKKK4 MEKK4 MTK1 | MREAAAALVPPPAFAVTPAAAMEEPPPPPPPPPPPPEPETESEPEC... | 1494 | SGLIKLGDFGCSVKLKNNAQtMPGEVNSTLGTAAYMAPEVI | Q9Y6R4_T1494 |
187065 | Q9Y6R4_Q9Y6R4_Y1328 | Q9Y6R4 | Q9Y6R4 | Y1328 | Sugiyama | MAP3K4 KIAA0213 MAPKKK4 MEKK4 MTK1 | MREAAAALVPPPAFAVTPAAAMEEPPPPPPPPPPPPEPETESEPEC... | 1328 | RYREMRRKNIIGQVCDtPKSyDNVMHVGLRKVTFKWQRGNK | Q9Y6R4_Y1328 |
187066 rows × 10 columns
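The `site_seq` column appears to be a 41-residue window centered on `position` in `substrate_phosphoseq`, padded with `_` where the window runs past a protein terminus. The sketch below illustrates that relationship; it is an inference from the printed rows, not katlas's own implementation, and `site_window` is a hypothetical helper name.

```python
def site_window(phosphoseq: str, position: int, flank: int = 20, pad: str = "_") -> str:
    """Return the residues within `flank` of a 1-based position, padded to 2*flank+1."""
    center = position - 1  # convert to a 0-based index
    if not 0 <= center < len(phosphoseq):
        raise ValueError("position lies outside the sequence")
    left = phosphoseq[max(0, center - flank):center]
    right = phosphoseq[center + 1:center + 1 + flank]
    # Pad whichever side runs past the start or end of the protein.
    return (pad * (flank - len(left)) + left + phosphoseq[center]
            + right + pad * (flank - len(right)))

# Prefix of the P62273 (RPS29) phospho-sequence as printed in the table above.
rps29_prefix = "MGHQQLyWsHPRKFGQGSRSCRVCSNRHGLIRKyGLNMCRQCFRQY"

window_41 = site_window(rps29_prefix, position=7, flank=20)
window_15 = site_window(rps29_prefix, position=7, flank=7)
print(window_41)
print(window_15)
```

With `flank=20` this should reproduce the `site_seq` shown for `P62273_Y7`; `flank=7` gives the shorter 15-residue form used in the site tables further down.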
Phosphoproteomics data
PhosphoSitePlus human
Data.get_psp_human_site()
| | gene | protein | uniprot | site | gene_site | SITE_GRP_ID | species | site_seq | LT_LIT | MS_LIT | MS_CST | CST_CAT# | Ambiguous_Site |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|
0 | YWHAB | 14-3-3 beta | P31946 | T2 | YWHAB_T2 | 15718712 | human | ______MtMDksELV | NaN | 3.0 | 1.0 | None | 0 |
1 | YWHAB | 14-3-3 beta | P31946 | S6 | YWHAB_S6 | 15718709 | human | __MtMDksELVQkAk | NaN | 8.0 | NaN | None | 0 |
2 | YWHAB | 14-3-3 beta | P31946 | Y21 | YWHAB_Y21 | 3426383 | human | LAEQAERyDDMAAAM | NaN | NaN | 4.0 | None | 0 |
3 | YWHAB | 14-3-3 beta | P31946 | T32 | YWHAB_T32 | 23077803 | human | AAAMkAVtEQGHELs | NaN | NaN | 1.0 | None | 0 |
4 | YWHAB | 14-3-3 beta | P31946 | S39 | YWHAB_S39 | 27442700 | human | tEQGHELsNEERNLL | NaN | 4.0 | NaN | None | 0 |
... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
240006 | ZZZ3 | ZZZ3 | Q8IYH5 | S474 | ZZZ3_S474 | 482028 | human | PsAKESAsQHITEEE | NaN | 1.0 | NaN | None | 0 |
240007 | ZZZ3 | ZZZ3 | Q8IYH5 | S606 | ZZZ3_S606 | 23077718 | human | GLPARPksPLDPKKD | NaN | 6.0 | 4.0 | None | 0 |
240008 | ZZZ3 | ZZZ3 | Q8IYH5 | Y670 | ZZZ3_Y670 | 23077724 | human | LEQLLIKyPPEEVEs | NaN | NaN | 1.0 | None | 0 |
240009 | ZZZ3 | ZZZ3 | Q8IYH5 | S677 | ZZZ3_S677 | 23077721 | human | yPPEEVEsRRWQKIA | NaN | NaN | 1.0 | None | 0 |
240010 | ZZZ3 | ZZZ3 | Q8IYH5 | S777 | ZZZ3_S777 | 41455930 | human | NTAVEDAsDDESIPI | NaN | 2.0 | NaN | None | 0 |
240011 rows × 13 columns
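Site labels such as `T2` or `Y21` combine the acceptor residue with its 1-based position, and `gene_site` joins the gene name and the site label with an underscore. A minimal sketch for splitting and rebuilding these labels follows; the helper names are hypothetical, and the pattern assumes only S, T, and Y acceptors, as in the rows shown.

```python
import re
from typing import Tuple

# One acceptor letter followed by the 1-based position, e.g. "Y21".
SITE_PATTERN = re.compile(r"^([STY])(\d+)$")

def parse_site(site: str) -> Tuple[str, int]:
    """Split a site label into (residue, position)."""
    match = SITE_PATTERN.match(site)
    if match is None:
        raise ValueError(f"unrecognized site label: {site!r}")
    return match.group(1), int(match.group(2))

def make_gene_site(gene: str, site: str) -> str:
    """Join gene name and site label the way the gene_site column does."""
    return f"{gene}_{site}"

residue, position = parse_site("Y21")
gene_site = make_gene_site("YWHAB", "Y21")
print(residue, position, gene_site)
```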
Ochoa et al. dataset
Data.get_ochoa_site()
| | uniprot | position | residue | is_disopred | disopred_score | log10_hotspot_pval_min | isHotspot | uniprot_position | functional_score | current_uniprot | name | gene | Sequence | is_valid | site_seq | gene_site |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
0 | A0A075B6Q4 | 24 | S | True | 0.91 | 6.839384 | True | A0A075B6Q4_24 | 0.149257 | A0A075B6Q4 | A0A075B6Q4_HUMAN | None | MDIQKSENEDDSEWEDVDDEKGDSNDDYDSAGLLSDEDCMSVPGKT... | True | VDDEKGDSNDDYDSA | A0A075B6Q4_S24 |
1 | A0A075B6Q4 | 35 | S | True | 0.87 | 9.192622 | False | A0A075B6Q4_35 | 0.136966 | A0A075B6Q4 | A0A075B6Q4_HUMAN | None | MDIQKSENEDDSEWEDVDDEKGDSNDDYDSAGLLSDEDCMSVPGKT... | True | YDSAGLLSDEDCMSV | A0A075B6Q4_S35 |
2 | A0A075B6Q4 | 57 | S | False | 0.28 | 0.818834 | False | A0A075B6Q4_57 | 0.125364 | A0A075B6Q4 | A0A075B6Q4_HUMAN | None | MDIQKSENEDDSEWEDVDDEKGDSNDDYDSAGLLSDEDCMSVPGKT... | True | IADHLFWSEETKSRF | A0A075B6Q4_S57 |
3 | A0A075B6Q4 | 68 | S | False | 0.03 | 0.375986 | False | A0A075B6Q4_68 | 0.119811 | A0A075B6Q4 | A0A075B6Q4_HUMAN | None | MDIQKSENEDDSEWEDVDDEKGDSNDDYDSAGLLSDEDCMSVPGKT... | True | KSRFTEYSMTSSVMR | A0A075B6Q4_S68 |
4 | A0A075B6Q4 | 71 | S | False | 0.05 | 0.000000 | False | A0A075B6Q4_71 | 0.095193 | A0A075B6Q4 | A0A075B6Q4_HUMAN | None | MDIQKSENEDDSEWEDVDDEKGDSNDDYDSAGLLSDEDCMSVPGKT... | True | FTEYSMTSSVMRRNE | A0A075B6Q4_S71 |
... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
112276 | V9GYY5 | 127 | S | True | 0.97 | 3.193174 | False | V9GYY5_127 | 0.292446 | V9GYY5 | V9GYY5_HUMAN | None | KRDGDDRRPRLVLSFDEEKRREYLTGFHKRKVERKKAAIEEIKQRL... | True | EGGAGDRSEEEASST | V9GYY5_S127 |
112277 | V9GYY5 | 132 | S | True | 0.93 | 2.055830 | False | V9GYY5_132 | 0.219329 | V9GYY5 | V9GYY5_HUMAN | None | KRDGDDRRPRLVLSFDEEKRREYLTGFHKRKVERKKAAIEEIKQRL... | True | DRSEEEASSTEKPTK | V9GYY5_S132 |
112278 | V9GYY5 | 133 | S | True | 0.89 | 2.055830 | False | V9GYY5_133 | 0.202808 | V9GYY5 | V9GYY5_HUMAN | None | KRDGDDRRPRLVLSFDEEKRREYLTGFHKRKVERKKAAIEEIKQRL... | True | RSEEEASSTEKPTKA | V9GYY5_S133 |
112279 | V9GYY5 | 134 | T | True | 0.83 | 2.055830 | False | V9GYY5_134 | 0.187417 | V9GYY5 | V9GYY5_HUMAN | None | KRDGDDRRPRLVLSFDEEKRREYLTGFHKRKVERKKAAIEEIKQRL... | True | SEEEASSTEKPTKAL | V9GYY5_T134 |
112280 | V9GYY5 | 138 | T | True | 0.82 | 0.726611 | False | V9GYY5_138 | 0.121025 | V9GYY5 | V9GYY5_HUMAN | None | KRDGDDRRPRLVLSFDEEKRREYLTGFHKRKVERKKAAIEEIKQRL... | True | ASSTEKPTKALPRKS | V9GYY5_T138 |
112281 rows × 16 columns
Combined Ochoa and PhosphoSitePlus
Data.get_combine_site_psp_ochoa()
| | uniprot | gene | site | site_seq | source | AM_pathogenicity | CDDM_upper | CDDM_max_score |
---|---|---|---|---|---|---|---|---|
0 | A0A024R4G9 | C19orf48 | S20 | ITGSRLLSMVPGPAR | psp | NaN | PRKX,AKT1,PKG1,P90RSK,HIPK4,AKT3,HIPK1,PKACB,H... | 2.407041 |
1 | A0A075B6Q4 | None | S24 | VDDEKGDSNDDYDSA | ochoa | NaN | CK2A2,CK2A1,GRK7,GRK5,CK1G1,CK1A,IKKA,CK1G2,CA... | 2.295654 |
2 | A0A075B6Q4 | None | S35 | YDSAGLLSDEDCMSV | ochoa | NaN | CK2A2,CK2A1,IKKA,ATM,IKKB,CAMK1D,MARK2,GRK7,IK... | 2.488683 |
3 | A0A075B6Q4 | None | S57 | IADHLFWSEETKSRF | ochoa | NaN | GRK7,CK2A1,CK2A2,PKN2,GRK1,GRK5,MARK1,MARK2,UL... | 1.851894 |
4 | A0A075B6Q4 | None | S68 | KSRFTEYSMTSSVMR | ochoa | NaN | AKT1,P90RSK,AKT3,SGK1,AKT2,NDR2,RSK2,P70S6K,RS... | 2.026384 |
... | ... | ... | ... | ... | ... | ... | ... | ... |
121414 | V9GYY5 | None | S127 | EGGAGDRSEEEASST | ochoa | NaN | CK2A1,CK2A2,GRK7,GRK5,ALK2,GRK1,CK1E,PLK3,CK1A... | 2.665606 |
121415 | V9GYY5 | None | S132 | DRSEEEASSTEKPTK | ochoa | NaN | CK2A2,CK2A1,GRK7,TGFBR1,GRK2,ALK2,PLK3,CLK3,BM... | 2.445179 |
121416 | V9GYY5 | None | S133 | RSEEEASSTEKPTKA | ochoa | NaN | CK2A1,ATR,GRK1,CK1G1,PLK3,CLK3,GRK7,CK1G2,MARK... | 2.090739 |
121417 | V9GYY5 | None | T134 | SEEEASSTEKPTKAL | ochoa | NaN | ASK1,PERK,EEF2K,MAP2K4,MEKK2,MST1,BMPR1B,OSR1,... | 1.832532 |
121418 | V9GYY5 | None | T138 | ASSTEKPTKALPRKS | ochoa | NaN | ASK1,MEK2,MPSK1,TNIK,PBK,MST2,MINK,NEK4,LKB1,MEK5 | 1.807565 |
121419 rows × 8 columns
Phosphorylated version of the above dataset
The Ochoa dataset was first phosphorylated based on the given site information (phosphorylated residues appear in lowercase), then combined with the PSP dataset, which already encodes phosphorylation status.
Data.get_combine_site_phosphorylated()
| | uniprot | gene | site | site_seq | source | AM_pathogenicity | CDDM | PSPA | CDDM_max_score | PSPA_max_score |
---|---|---|---|---|---|---|---|---|---|---|
0 | A0A024R4G9 | C19orf48 | S20 | ITGSRLLsMVPGPAR | psp | NaN | PRKX,PKG1,AKT1,AKT3,HIPK4,P90RSK,PKACB,PKACA,P... | MAPKAPK5,AKT1,RSK3,P70S6K,MAPKAPK3,AKT2,DYRK1A... | 2.339278 | 3.726109 |
1 | A0A075B6Q4 | None | S24 | VDDEKGDsNDDYDSA | ochoa | NaN | CK2A2,CK2A1,GRK7,GRK5,CK1G1,IKKA,CAMK1D,MARK2,... | CAMK2B,CK2A2,CAMK2A,CK2A1,GRK7,TLK2,FAM20C,CAM... | 2.253027 | 4.940056 |
2 | A0A075B6Q4 | None | S35 | YDSAGLLsDEDCMSV | ochoa | NaN | CK2A2,CK2A1,ATM,CAMK1D,IKKB,IKKA,MARK2,MARK1,G... | CK2A1,CK2A2,FAM20C,CDC7,GRK6,SMG1,ALK2,TGFBR1,... | 2.396014 | 5.803230 |
3 | A0A075B6Q4 | None | S57 | IADHLFWsEETKSRF | ochoa | NaN | GRK7,CK2A2,PKN2,CK2A1,GRK1,MARK1,TSSK2,MARK2,N... | FAM20C,ACVR2B,GRK1,CDC7,BMPR1B,BMPR1A,ACVR2A,D... | 1.793644 | 4.038678 |
4 | A0A075B6Q4 | None | S68 | KSRFTEYsMTssVMR | ochoa | NaN | AKT1,P90RSK,RSK2,RSK4,CLK3,NDR2,P70S6K,AKT3,SG... | GSK3B,GSK3A,CK1G2,PLK2,PLK3,TGFBR1,TLK2,GRK3,A... | 1.789278 | 7.268416 |
... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
120099 | V9GYY5 | None | S127 | EGGAGDRsEEEAsst | ochoa | NaN | CK2A2,CK2A1,GRK7,ALK2,GRK5,GRK1,CK1E,PLK3,CK1A... | FAM20C,CK2A1,CK2A2,GRK7,BMPR1B,GRK1,ALK2,BMPR1... | 2.575281 | 7.326407 |
120100 | V9GYY5 | None | S132 | DRsEEEAsstEKPtK | ochoa | NaN | CK2A2,CK2A1,GRK7,TGFBR1,ALK2,GRK2,PLK3,BMPR1B,... | BMPR1B,BMPR1A,GRK3,CK2A1,PLK2,GRK7,ACVR2A,GRK2... | 2.359323 | 9.746005 |
120101 | V9GYY5 | None | S133 | RsEEEAsstEKPtKA | ochoa | NaN | GRK1,CK2A1,CK1G1,PLK3,GRK7,CK2A2,CK1G2,CLK3,AT... | BMPR1B,CK1G2,GRK7,BMPR1A,GRK3,PLK2,GRK1,ACVR2B... | 2.019862 | 5.370222 |
120102 | V9GYY5 | None | T134 | sEEEAsstEKPtKAL | ochoa | NaN | PERK,ASK1,EEF2K,MST1,BMPR1B,PBK,MEKK2,OSR1,MST... | CK1G2,GSK3A,ALPHAK3,GRK1,GRK7,GSK3B,BMPR1B,BMP... | 1.723089 | 7.009429 |
120103 | V9GYY5 | None | T138 | AsstEKPtKALPRKS | ochoa | NaN | ASK1,PBK,TNIK,MPSK1,MINK,MST2,NEK4,MEK2,MST1,BUB1 | CK1G3,CK1G2,CK1A2,CK1D,CK1A,GRK3,PASK,GRK2,CK1... | 1.651888 | 4.350109 |
120104 rows × 10 columns
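"Phosphorylating" a sequence here means marking the known sites in lowercase. The sketch below shows one way to do that for a list of 1-based positions; it is an illustration under that assumption rather than the library's own routine, and `phosphorylate` is a hypothetical name.

```python
def phosphorylate(sequence: str, positions) -> str:
    """Lowercase the S/T/Y residues at the given 1-based positions."""
    residues = list(sequence)
    for pos in positions:
        idx = pos - 1  # convert to a 0-based index
        if residues[idx].upper() not in "STY":
            raise ValueError(f"residue {residues[idx]!r} at {pos} is not S, T, or Y")
        residues[idx] = residues[idx].lower()
    return "".join(residues)

# Prefix of the A0A075B6Q4 sequence as printed in the Ochoa table above.
prefix = "MDIQKSENEDDSEWEDVDDEKGDSNDDYDSAGLLSDEDCMSVPGKT"

# S24 and S35 are two of the sites listed for this protein.
phospho = phosphorylate(prefix, [24, 35])
print(phospho)
```

Taking a 15-residue window around position 24 of the result gives `VDDEKGDsNDDYDSA`, the first Ochoa `site_seq` in the table above.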
Human phosphoproteome
With phosphorylated protein sequence and site sequence
Data.get_human_site()
| | substrate_uniprot | substrate_genes | site | source | AM_pathogenicity | substrate_sequence | substrate_species | sub_site | substrate_phosphoseq | position | site_seq |
---|---|---|---|---|---|---|---|---|---|---|---|
0 | A0A024R4G9 | C19orf48 MGC13170 hCG_2008493 | S20 | psp | NaN | MTVLEAVLEIQAITGSRLLSMVPGPARPPGSCWDPTQCTRTWLLSH... | Homo sapiens (Human) | A0A024R4G9_S20 | MTVLEAVLEIQAITGSRLLsMVPGPARPPGSCWDPTQCTRTWLLSH... | 20 | _MTVLEAVLEIQAITGSRLLsMVPGPARPPGSCWDPTQCTR |
1 | A0A075B6Q4 | None | S24 | ochoa | NaN | MDIQKSENEDDSEWEDVDDEKGDSNDDYDSAGLLSDEDCMSVPGKT... | Homo sapiens (Human) | A0A075B6Q4_S24 | MDIQKSENEDDSEWEDVDDEKGDsNDDYDSAGLLsDEDCMSVPGKT... | 24 | QKSENEDDSEWEDVDDEKGDsNDDYDSAGLLsDEDCMSVPG |
2 | A0A075B6Q4 | None | S35 | ochoa | NaN | MDIQKSENEDDSEWEDVDDEKGDSNDDYDSAGLLSDEDCMSVPGKT... | Homo sapiens (Human) | A0A075B6Q4_S35 | MDIQKSENEDDSEWEDVDDEKGDsNDDYDSAGLLsDEDCMSVPGKT... | 35 | EDVDDEKGDsNDDYDSAGLLsDEDCMSVPGKTHRAIADHLF |
3 | A0A075B6Q4 | None | S57 | ochoa | NaN | MDIQKSENEDDSEWEDVDDEKGDSNDDYDSAGLLSDEDCMSVPGKT... | Homo sapiens (Human) | A0A075B6Q4_S57 | MDIQKSENEDDSEWEDVDDEKGDsNDDYDSAGLLsDEDCMSVPGKT... | 57 | EDCMSVPGKTHRAIADHLFWsEETKSRFTEYsMTssVMRRN |
4 | A0A075B6Q4 | None | S68 | ochoa | NaN | MDIQKSENEDDSEWEDVDDEKGDSNDDYDSAGLLSDEDCMSVPGKT... | Homo sapiens (Human) | A0A075B6Q4_S68 | MDIQKSENEDDSEWEDVDDEKGDsNDDYDSAGLLsDEDCMSVPGKT... | 68 | RAIADHLFWsEETKSRFTEYsMTssVMRRNEQLTLHDERFE |
... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
121327 | V9GYY5 | None | S127 | ochoa | NaN | KRDGDDRRPRLVLSFDEEKRREYLTGFHKRKVERKKAAIEEIKQRL... | Homo sapiens (Human) | V9GYY5_S127 | KRDGDDRRPRLVLSFDEEKRREYLTGFHKRKVERKKAAIEEIKQRL... | 127 | DLsGARLLGLtPPEGGAGDRsEEEAsstEKPtKALPRKSRD |
121328 | V9GYY5 | None | S132 | ochoa | NaN | KRDGDDRRPRLVLSFDEEKRREYLTGFHKRKVERKKAAIEEIKQRL... | Homo sapiens (Human) | V9GYY5_S132 | KRDGDDRRPRLVLSFDEEKRREYLTGFHKRKVERKKAAIEEIKQRL... | 132 | RLLGLtPPEGGAGDRsEEEAsstEKPtKALPRKSRDPLLSQ |
121329 | V9GYY5 | None | S133 | ochoa | NaN | KRDGDDRRPRLVLSFDEEKRREYLTGFHKRKVERKKAAIEEIKQRL... | Homo sapiens (Human) | V9GYY5_S133 | KRDGDDRRPRLVLSFDEEKRREYLTGFHKRKVERKKAAIEEIKQRL... | 133 | LLGLtPPEGGAGDRsEEEAsstEKPtKALPRKSRDPLLSQR |
121330 | V9GYY5 | None | T134 | ochoa | NaN | KRDGDDRRPRLVLSFDEEKRREYLTGFHKRKVERKKAAIEEIKQRL... | Homo sapiens (Human) | V9GYY5_T134 | KRDGDDRRPRLVLSFDEEKRREYLTGFHKRKVERKKAAIEEIKQRL... | 134 | LGLtPPEGGAGDRsEEEAsstEKPtKALPRKSRDPLLSQRI |
121331 | V9GYY5 | None | T138 | ochoa | NaN | KRDGDDRRPRLVLSFDEEKRREYLTGFHKRKVERKKAAIEEIKQRL... | Homo sapiens (Human) | V9GYY5_T138 | KRDGDDRRPRLVLSFDEEKRREYLTGFHKRKVERKKAAIEEIKQRL... | 138 | PPEGGAGDRsEEEAsstEKPtKALPRKSRDPLLSQRISSLT |
119955 rows × 11 columns
CPTAC data
Query specific tumor type
CPTAC.list_cancer()
['HNSCC', 'GBM', 'COAD', 'CCRCC', 'LSCC', 'BRCA', 'UCEC', 'LUAD', 'PDAC', 'OV']
To load CPTAC phosphorylation site information, use CPTAC.get_id(). Fold changes for various conditions can be obtained from LinkedOmics or LinkedOmicsKB. Use is_KB to indicate whether the phosphorylation site information is for LinkedOmics or LinkedOmicsKB.
tumor = CPTAC.get_id('CCRCC', is_KB=True)
normal = CPTAC.get_id('CCRCC', is_KB=True, is_Tumor=False)
tumor.head()
the CCRCC dataset length is: 54238
after id mapping, the length is 213737
0 sites does not have a mapped gene name
after removing duplicates of protein_site, the length is 212814
the CCRCC dataset length is: 53152
after id mapping, the length is 209188
0 sites does not have a mapped gene name
after removing duplicates of protein_site, the length is 208298
| | gene | site | site_seq | protein | gene_name | gene_site | protein_site |
---|---|---|---|---|---|---|---|
0 | ENSG00000003056.8 | S267 | DDQLGEESEERDDHL | ENSP00000000412.3 | M6PR | M6PR_S267 | ENSP00000000412_S267 |
1 | ENSG00000003056.8 | S267 | DDQLGEESEERDDHL | ENSP00000440488.2 | M6PR | M6PR_S267 | ENSP00000440488_S267 |
2 | ENSG00000048028.11 | S1053 | PPTIRPNSPYDLCSR | ENSP00000003302.4 | USP28 | USP28_S1053 | ENSP00000003302_S1053 |
3 | ENSG00000048028.11 | S1053 | PPTIRPNSPYDLCSR | ENSP00000445743.1 | USP28 | USP28_S1053 | ENSP00000445743_S1053 |
4 | ENSG00000048028.11 | S1053 | PPTIRPNSPYDLCSR | ENSP00000442431.1 | USP28 | USP28_S1053 | ENSP00000442431_S1053 |
Unique Ensembl protein ID + site
Query all cancer types and compile
Data.get_cptac_ensembl_site()
| | gene | site | site_seq | protein | gene_name | gene_site | protein_site |
---|---|---|---|---|---|---|---|
0 | ENSG00000003056.8 | S267 | DDQLGEESEERDDHL | ENSP00000000412.3 | M6PR | M6PR_S267 | ENSP00000000412_S267 |
1 | ENSG00000003056.8 | S267 | DDQLGEESEERDDHL | ENSP00000440488.2 | M6PR | M6PR_S267 | ENSP00000440488_S267 |
2 | ENSG00000048028.11 | S1053 | PPTIRPNSPYDLCSR | ENSP00000003302.4 | USP28 | USP28_S1053 | ENSP00000003302_S1053 |
3 | ENSG00000048028.11 | S1053 | PPTIRPNSPYDLCSR | ENSP00000445743.1 | USP28 | USP28_S1053 | ENSP00000445743_S1053 |
4 | ENSG00000048028.11 | S1053 | PPTIRPNSPYDLCSR | ENSP00000442431.1 | USP28 | USP28_S1053 | ENSP00000442431_S1053 |
... | ... | ... | ... | ... | ... | ... | ... |
488581 | ENSG00000173230.15 | S2878 | TSPAEVQSLKKAMSS | ENSP00000484083.1 | GOLGB1 | GOLGB1_S2878 | ENSP00000484083_S2878 |
488582 | ENSG00000143631.11 | S1642 | SHQEDRASHGHSAES | ENSP00000357789.1 | FLG | FLG_S1642 | ENSP00000357789_S1642 |
488583 | ENSG00000143631.11 | S495 | STGGRQGSHHEQARD | ENSP00000357789.1 | FLG | FLG_S495 | ENSP00000357789_S495 |
488584 | ENSG00000143631.11 | S648 | ASRNHHGSAQEQSRD | ENSP00000357789.1 | FLG | FLG_S648 | ENSP00000357789_S648 |
488585 | ENSG00000143520.6 | S2310 | DTTRHGHSGYGQSTQ | ENSP00000373370.4 | FLG2 | FLG2_S2310 | ENSP00000373370_S2310 |
488586 rows × 7 columns
Unique site sequences
Data.get_cptac_unique_site()
| | site_seq | gene_site | num_site | acceptor |
---|---|---|---|---|
0 | AAAAAAASFPWSAFG | ZBTB7A_S182 | 1 | S |
1 | AAAAAAASGAAGGGG | INTS3_S16 | 1 | S |
2 | AAAAAAASGALLGAY | TMEM64_S62 | 1 | S |
3 | AAAAAAASGGAGSDN | PBX1_S136 | 1 | S |
4 | AAAAAAASGGGVSPD | PBX2_S146 | 1 | S |
... | ... | ... | ... | ... |
125471 | ______MTMETLPKV | PIRT_T2 | 1 | T |
125472 | ______MTPPPPPPP | ESRP2_T2 | 1 | T |
125473 | ______MTVSGPGTP | UNC45A_T2 | 1 | T |
125474 | ______MYPAGPPAG | TIGD5_Y2 | 1 | Y |
125475 | _______SPASLPLA | RFLNB_S1 | 1 | S |
125476 rows × 4 columns
Unique gene name+site
Data.get_cptac_gene_site()
| | gene | site | site_seq | protein | gene_name | gene_site | protein_site |
---|---|---|---|---|---|---|---|
0 | ENSG00000003056.8 | S267 | DDQLGEESEERDDHL | ENSP00000000412.3 | M6PR | M6PR_S267 | ENSP00000000412_S267 |
1 | ENSG00000048028.11 | S1053 | PPTIRPNSPYDLCSR | ENSP00000003302.4 | USP28 | USP28_S1053 | ENSP00000003302_S1053 |
2 | ENSG00000004776.13 | S16 | PSWLRRASAPLPGLS | ENSP00000004982.3 | HSPB6 | HSPB6_S16 | ENSP00000004982_S16 |
3 | ENSG00000005175.10 | S116 | DSTHESLSQESESEE | ENSP00000005386.3 | RPAP3 | RPAP3_S116 | ENSP00000005386_S116 |
4 | ENSG00000005175.10 | S121 | SLSQESESEEDGIHV | ENSP00000005386.3 | RPAP3 | RPAP3_S121 | ENSP00000005386_S121 |
... | ... | ... | ... | ... | ... | ... | ... |
126220 | ENSG00000173230.15 | S2878 | TSPAEVQSLKKAMSS | ENSP00000341848.5 | GOLGB1 | GOLGB1_S2878 | ENSP00000341848_S2878 |
126221 | ENSG00000143631.11 | S1642 | SHQEDRASHGHSAES | ENSP00000357789.1 | FLG | FLG_S1642 | ENSP00000357789_S1642 |
126222 | ENSG00000143631.11 | S495 | STGGRQGSHHEQARD | ENSP00000357789.1 | FLG | FLG_S495 | ENSP00000357789_S495 |
126223 | ENSG00000143631.11 | S648 | ASRNHHGSAQEQSRD | ENSP00000357789.1 | FLG | FLG_S648 | ENSP00000357789_S648 |
126224 | ENSG00000143520.6 | S2310 | DTTRHGHSGYGQSTQ | ENSP00000373370.4 | FLG2 | FLG2_S2310 | ENSP00000373370_S2310 |
126225 rows × 7 columns
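The three CPTAC views differ in which column defines uniqueness: Ensembl protein + site, site sequence, or gene name + site. The sketch below shows the idea with pandas `drop_duplicates` on the five rows printed at the top of the tables above. It is only an approximation of how such views could be derived; the actual katlas pipeline may differ.

```python
import pandas as pd

# The first five rows shown above, reduced to the columns that matter here.
rows = pd.DataFrame({
    "site_seq": ["DDQLGEESEERDDHL"] * 2 + ["PPTIRPNSPYDLCSR"] * 3,
    "protein": [
        "ENSP00000000412.3", "ENSP00000440488.2", "ENSP00000003302.4",
        "ENSP00000445743.1", "ENSP00000442431.1",
    ],
    "gene_site": ["M6PR_S267"] * 2 + ["USP28_S1053"] * 3,
    "protein_site": [
        "ENSP00000000412_S267", "ENSP00000440488_S267", "ENSP00000003302_S1053",
        "ENSP00000445743_S1053", "ENSP00000442431_S1053",
    ],
})

# Each call keeps the first row for every distinct value of the chosen key.
by_protein_site = rows.drop_duplicates(subset="protein_site")
by_gene_site = rows.drop_duplicates(subset="gene_site")
by_site_seq = rows.drop_duplicates(subset="site_seq")

print(len(by_protein_site), len(by_gene_site), len(by_site_seq))
```

Keeping the first row per `gene_site` retains `ENSP00000000412.3` and `ENSP00000003302.4`, which is consistent with the first two rows of the gene-site table.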
PSSM: CDDM data
Data.get_cddm()
substrate | -7P | -7G | -7A | -7C | -7S | -7T | -7V | -7I | -7L | -7M | ... | 7E | 7s | 7t | 7y | 0s | 0t | 0y | 0S | 0T | 0Y |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
kinase | |||||||||||||||||||||
SRC | 0.055749 | 0.064895 | 0.060105 | 0.010017 | 0.045732 | 0.033101 | 0.049216 | 0.037892 | 0.080139 | 0.020035 | ... | 0.085192 | 0.029438 | 0.013381 | 0.017841 | 0.038927 | 0.034602 | 0.926471 | 0.038927 | 0.034602 | 0.926471 |
EPHA3 | 0.042881 | 0.075316 | 0.068169 | 0.013194 | 0.039582 | 0.031336 | 0.048378 | 0.043430 | 0.079714 | 0.021440 | ... | 0.092551 | 0.024266 | 0.013544 | 0.022009 | 0.054526 | 0.035442 | 0.910033 | 0.054526 | 0.035442 | 0.910033 |
FES | 0.049633 | 0.075578 | 0.065990 | 0.012972 | 0.036097 | 0.029893 | 0.055274 | 0.040045 | 0.080090 | 0.014100 | ... | 0.084483 | 0.024713 | 0.017816 | 0.022414 | 0.038699 | 0.026921 | 0.934380 | 0.038699 | 0.026921 | 0.934380 |
NTRK3 | 0.042771 | 0.074699 | 0.073494 | 0.012048 | 0.034940 | 0.022289 | 0.052410 | 0.044578 | 0.080723 | 0.016867 | ... | 0.079679 | 0.026560 | 0.022236 | 0.022236 | 0.101796 | 0.060479 | 0.837725 | 0.101796 | 0.060479 | 0.837725 |
ALK | 0.045482 | 0.076214 | 0.070682 | 0.014136 | 0.034419 | 0.028273 | 0.049170 | 0.035648 | 0.079902 | 0.018439 | ... | 0.081835 | 0.028518 | 0.018599 | 0.023559 | 0.059038 | 0.044431 | 0.896531 | 0.059038 | 0.044431 | 0.896531 |
... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
CDK8 | 0.056818 | 0.079545 | 0.090909 | 0.011364 | 0.011364 | 0.022727 | 0.068182 | 0.045455 | 0.011364 | 0.011364 | ... | 0.103448 | 0.045977 | 0.022989 | 0.000000 | 0.752809 | 0.202247 | 0.044944 | 0.752809 | 0.202247 | 0.044944 |
BUB1 | 0.023256 | 0.069767 | 0.081395 | 0.000000 | 0.023256 | 0.011628 | 0.058140 | 0.023256 | 0.058140 | 0.023256 | ... | 0.105882 | 0.058824 | 0.070588 | 0.011765 | 0.558140 | 0.406977 | 0.034884 | 0.558140 | 0.406977 | 0.034884 |
MEKK3 | 0.083333 | 0.071429 | 0.059524 | 0.000000 | 0.071429 | 0.000000 | 0.047619 | 0.059524 | 0.059524 | 0.011905 | ... | 0.073171 | 0.048780 | 0.012195 | 0.000000 | 0.458824 | 0.470588 | 0.070588 | 0.458824 | 0.470588 | 0.070588 |
MAP2K3 | 0.045977 | 0.057471 | 0.114943 | 0.000000 | 0.045977 | 0.045977 | 0.022989 | 0.022989 | 0.022989 | 0.011494 | ... | 0.109756 | 0.085366 | 0.036585 | 0.000000 | 0.528090 | 0.191011 | 0.280899 | 0.528090 | 0.191011 | 0.280899 |
GRK1 | 0.060241 | 0.072289 | 0.084337 | 0.000000 | 0.048193 | 0.036145 | 0.024096 | 0.060241 | 0.012048 | 0.012048 | ... | 0.197368 | 0.039474 | 0.000000 | 0.013158 | 0.831325 | 0.156627 | 0.012048 | 0.831325 | 0.156627 | 0.012048 |
289 rows × 328 columns
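Each row is a flattened position-specific scoring matrix: a column label such as `-7P` or `0y` combines a position relative to the phosphoacceptor with a residue letter. The sketch below reshapes a few SRC entries copied from the table into a residue-by-position matrix using only pandas. katlas may provide its own helper for this; `split_label` is a hypothetical name.

```python
import pandas as pd

# A handful of entries from the SRC row above; a full row has 328 labels.
src_partial = pd.Series({
    "-7P": 0.055749, "-7G": 0.064895, "-7A": 0.060105,
    "0s": 0.038927, "0t": 0.034602, "0y": 0.926471,
})

def split_label(label: str):
    """Split '-7P' into (-7, 'P'): the last character is the residue."""
    return int(label[:-1]), label[-1]

index = pd.MultiIndex.from_tuples(
    [split_label(label) for label in src_partial.index],
    names=["position", "residue"],
)

# Rows become residues, columns become positions; missing pairs are NaN.
matrix = pd.Series(src_partial.to_numpy(), index=index).unstack("position")
print(matrix)
```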
All uppercase
Data.get_cddm_upper()
substrate | -7P | -7G | -7A | -7C | -7S | -7T | -7V | -7I | -7L | -7M | ... | 7Q | 7N | 7D | 7E | 0S | 0T | 0Y | 0s | 0t | 0y |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
kinase | |||||||||||||||||||||
SRC | 0.055749 | 0.064895 | 0.060105 | 0.010017 | 0.071429 | 0.046167 | 0.049216 | 0.037892 | 0.080139 | 0.020035 | ... | 0.045941 | 0.036574 | 0.074487 | 0.085192 | 0.038927 | 0.034602 | 0.926471 | 0.038927 | 0.034602 | 0.926471 |
EPHA3 | 0.042881 | 0.075316 | 0.068169 | 0.013194 | 0.064871 | 0.042881 | 0.048378 | 0.043430 | 0.079714 | 0.021440 | ... | 0.046275 | 0.046840 | 0.073928 | 0.092551 | 0.054526 | 0.035442 | 0.910033 | 0.054526 | 0.035442 | 0.910033 |
FES | 0.049633 | 0.075578 | 0.065990 | 0.012972 | 0.059222 | 0.038353 | 0.055274 | 0.040045 | 0.080090 | 0.014100 | ... | 0.048276 | 0.044828 | 0.074138 | 0.084483 | 0.038699 | 0.026921 | 0.934380 | 0.038699 | 0.026921 | 0.934380 |
NTRK3 | 0.042771 | 0.074699 | 0.073494 | 0.012048 | 0.060843 | 0.039157 | 0.052410 | 0.044578 | 0.080723 | 0.016867 | ... | 0.040148 | 0.040148 | 0.071032 | 0.079679 | 0.101796 | 0.060479 | 0.837725 | 0.101796 | 0.060479 | 0.837725 |
ALK | 0.045482 | 0.076214 | 0.070682 | 0.014136 | 0.056546 | 0.041795 | 0.049170 | 0.035648 | 0.079902 | 0.018439 | ... | 0.046497 | 0.039058 | 0.068196 | 0.081835 | 0.059038 | 0.044431 | 0.896531 | 0.059038 | 0.044431 | 0.896531 |
... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
CDK8 | 0.056818 | 0.079545 | 0.090909 | 0.011364 | 0.056818 | 0.090909 | 0.068182 | 0.045455 | 0.011364 | 0.011364 | ... | 0.080460 | 0.057471 | 0.080460 | 0.103448 | 0.752809 | 0.202247 | 0.044944 | 0.752809 | 0.202247 | 0.044944 |
BUB1 | 0.023256 | 0.069767 | 0.081395 | 0.000000 | 0.081395 | 0.069767 | 0.058140 | 0.023256 | 0.058140 | 0.023256 | ... | 0.023529 | 0.070588 | 0.035294 | 0.105882 | 0.558140 | 0.406977 | 0.034884 | 0.558140 | 0.406977 | 0.034884 |
MEKK3 | 0.083333 | 0.071429 | 0.059524 | 0.000000 | 0.083333 | 0.011905 | 0.047619 | 0.059524 | 0.059524 | 0.011905 | ... | 0.060976 | 0.024390 | 0.036585 | 0.073171 | 0.458824 | 0.470588 | 0.070588 | 0.458824 | 0.470588 | 0.070588 |
MAP2K3 | 0.045977 | 0.057471 | 0.114943 | 0.000000 | 0.068966 | 0.045977 | 0.022989 | 0.022989 | 0.022989 | 0.011494 | ... | 0.024390 | 0.060976 | 0.073171 | 0.109756 | 0.528090 | 0.191011 | 0.280899 | 0.528090 | 0.191011 | 0.280899 |
GRK1 | 0.060241 | 0.072289 | 0.084337 | 0.000000 | 0.084337 | 0.096386 | 0.024096 | 0.060241 | 0.012048 | 0.012048 | ... | 0.013158 | 0.026316 | 0.118421 | 0.197368 | 0.831325 | 0.156627 | 0.012048 | 0.831325 | 0.156627 | 0.012048 |
289 rows × 286 columns
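The two column counts (328 and 286) are consistent with a simple layout: every flanking position carries either 23 residues (20 standard plus phosphorylated s, t, y) or 20 uppercase residues only, and position 0 always carries six acceptor labels. This is an inference from the printed shapes, not a documented guarantee. The sketch below only checks the arithmetic; the label order it produces is arbitrary and `pssm_labels` is a hypothetical helper.

```python
def pssm_labels(positions, phospho_flanks=True):
    """Build '<position><residue>' labels for a flattened PSSM."""
    upper = list("ACDEFGHIKLMNPQRSTVWY")  # 20 standard residues
    flank_alphabet = upper + (list("sty") if phospho_flanks else [])
    # Flanking positions use the full alphabet; position 0 is handled separately.
    labels = [f"{p}{aa}" for p in positions if p != 0 for aa in flank_alphabet]
    # The acceptor position only distinguishes S/T/Y and s/t/y.
    labels += [f"0{aa}" for aa in "STYsty"]
    return labels

cddm_positions = range(-7, 8)  # -7 .. 7
n_mixed = len(pssm_labels(cddm_positions, phospho_flanks=True))
n_upper = len(pssm_labels(cddm_positions, phospho_flanks=False))
print(n_mixed, n_upper)
```

Under the same assumption, positions -5 to 5 give 236 labels and -5 to 4 give 213.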
Other kinases
Data.get_cddm_others().head()
substrate | -7P | -7G | -7A | -7C | -7S | -7T | -7V | -7I | -7L | -7M | ... | 7Q | 7N | 7D | 7E | 7s | 7t | 7y | 0s | 0t | 0y |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
kinase | |||||||||||||||||||||
LYNb | 0.045929 | 0.068894 | 0.061935 | 0.013222 | 0.034795 | 0.029923 | 0.050800 | 0.045233 | 0.083507 | 0.022269 | ... | 0.045032 | 0.036455 | 0.074339 | 0.082202 | 0.027162 | 0.019299 | 0.017870 | 0.038010 | 0.035245 | 0.926745 |
ABL1[T315I] | 0.046140 | 0.074534 | 0.066548 | 0.010648 | 0.039042 | 0.023957 | 0.055013 | 0.037267 | 0.075421 | 0.021295 | ... | 0.048693 | 0.035167 | 0.064022 | 0.079351 | 0.027953 | 0.022543 | 0.019838 | 0.085613 | 0.045013 | 0.869373 |
ABL1[E255K] | 0.039631 | 0.065438 | 0.060829 | 0.014747 | 0.044240 | 0.030415 | 0.053456 | 0.036866 | 0.067281 | 0.021198 | ... | 0.048598 | 0.039252 | 0.073832 | 0.085047 | 0.025234 | 0.018692 | 0.017757 | 0.062271 | 0.042125 | 0.895604 |
RET[M918T] | 0.046422 | 0.080271 | 0.066731 | 0.010638 | 0.029014 | 0.023211 | 0.038685 | 0.044487 | 0.069632 | 0.023211 | ... | 0.053202 | 0.042365 | 0.077833 | 0.086700 | 0.032512 | 0.023645 | 0.018719 | 0.042025 | 0.025788 | 0.932187 |
FGFR3[K650M] | 0.031985 | 0.072437 | 0.068674 | 0.015992 | 0.031044 | 0.026341 | 0.035748 | 0.038570 | 0.079962 | 0.018815 | ... | 0.045977 | 0.040230 | 0.076628 | 0.089080 | 0.028736 | 0.022031 | 0.021073 | 0.051115 | 0.027881 | 0.921004 |
5 rows × 325 columns
Data.get_cddm_others_info().head()
| | kinase | count |
---|---|---|
0 | ALK | 1889 |
1 | ABL1 | 1837 |
2 | RET | 1769 |
3 | LYNb | 1694 |
4 | MET | 1485 |
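Index labels in the "other kinases" matrix can carry a variant in square brackets, for example `ABL1[T315I]`. A small sketch for separating the base kinase from the variant follows; the pattern and the name `split_variant` are assumptions based only on the five labels printed above.

```python
import re
from collections import Counter
from typing import Optional, Tuple

# Base name, optionally followed by a bracketed variant such as [T315I].
LABEL = re.compile(r"^(?P<kinase>[^\[\]]+)(?:\[(?P<variant>[^\[\]]+)\])?$")

def split_variant(label: str) -> Tuple[str, Optional[str]]:
    """Return (base kinase, variant or None) for a label like 'ABL1[T315I]'."""
    match = LABEL.match(label)
    if match is None:
        raise ValueError(f"unrecognized kinase label: {label!r}")
    return match.group("kinase"), match.group("variant")

labels = ["LYNb", "ABL1[T315I]", "ABL1[E255K]", "RET[M918T]", "FGFR3[K650M]"]
parsed = [split_variant(label) for label in labels]

# Number of entries per base kinase among the labels shown.
per_kinase = Counter(kinase for kinase, _ in parsed)
print(parsed)
print(per_kinase)
```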
PSSM: PSPA data
Normalized Tyr
Data.get_pspa_tyr_norm()
kinase | -5P | -5G | -5A | -5C | -5S | -5T | -5V | -5I | -5L | -5M | ... | 5E | 5s | 5t | 5y | 0S | 0T | 0Y | 0s | 0t | 0y |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
ABL1 | 0.0668 | 0.0689 | 0.0646 | 0.0520 | 0.0564 | 0.0539 | 0.0485 | 0.0448 | 0.0520 | 0.0536 | ... | 0.0339 | 0.0254 | 0.0254 | 0.0337 | 0 | 0 | 1 | 0 | 0 | 1 |
TNK2 | 0.0679 | 0.0818 | 0.0627 | 0.0617 | 0.0529 | 0.0528 | 0.0419 | 0.0463 | 0.0437 | 0.0453 | ... | 0.0572 | 0.0364 | 0.0364 | 0.0572 | 0 | 0 | 1 | 0 | 0 | 1 |
ALK | 0.0675 | 0.0640 | 0.0590 | 0.0511 | 0.0476 | 0.0422 | 0.0455 | 0.0514 | 0.0546 | 0.0543 | ... | 0.0226 | 0.0181 | 0.0181 | 0.0172 | 0 | 0 | 1 | 0 | 0 | 1 |
ABL2 | 0.0687 | 0.0715 | 0.0611 | 0.0448 | 0.0537 | 0.0513 | 0.0467 | 0.0398 | 0.0462 | 0.0505 | ... | 0.0381 | 0.0252 | 0.0252 | 0.0289 | 0 | 0 | 1 | 0 | 0 | 1 |
AXL | 0.0656 | 0.0753 | 0.0535 | 0.0525 | 0.0468 | 0.0467 | 0.0459 | 0.0538 | 0.0507 | 0.0542 | ... | 0.0559 | 0.0413 | 0.0413 | 0.0455 | 0 | 0 | 1 | 0 | 0 | 1 |
... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
KDR | 0.0634 | 0.0672 | 0.0556 | 0.0517 | 0.0541 | 0.0526 | 0.0427 | 0.0420 | 0.0428 | 0.0476 | ... | 0.0387 | 0.0335 | 0.0335 | 0.0406 | 0 | 0 | 1 | 0 | 0 | 1 |
FLT4 | 0.0457 | 0.0531 | 0.0488 | 0.0553 | 0.0512 | 0.0471 | 0.0432 | 0.0499 | 0.0474 | 0.0530 | ... | 0.0528 | 0.0600 | 0.0600 | 0.0464 | 0 | 0 | 1 | 0 | 0 | 1 |
WEE1_TYR | 0.0531 | 0.0640 | 0.0559 | 0.0560 | 0.0433 | 0.0435 | 0.0568 | 0.0571 | 0.0637 | 0.0562 | ... | 0.0365 | 0.0453 | 0.0453 | 0.0490 | 0 | 0 | 1 | 0 | 0 | 1 |
YES1 | 0.0677 | 0.0571 | 0.0537 | 0.0530 | 0.0527 | 0.0505 | 0.0435 | 0.0375 | 0.0400 | 0.0463 | ... | 0.0482 | 0.0374 | 0.0374 | 0.0411 | 0 | 0 | 1 | 0 | 0 | 1 |
ZAP70 | 0.0602 | 0.0880 | 0.0623 | 0.0496 | 0.0471 | 0.0514 | 0.0465 | 0.0380 | 0.0307 | 0.0526 | ... | 0.0710 | 0.0862 | 0.0862 | 0.0605 | 0 | 0 | 1 | 0 | 0 | 1 |
93 rows × 236 columns
Normalized Ser/Thr
Data.get_pspa_st_norm()
kinase | -5P | -5G | -5A | -5C | -5S | -5T | -5V | -5I | -5L | -5M | ... | 4E | 4s | 4t | 4y | 0s | 0t | 0y | 0S | 0T | 0Y |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
AAK1 | 0.0720 | 0.0245 | 0.0284 | 0.0456 | 0.0425 | 0.0425 | 0.0951 | 0.1554 | 0.0993 | 0.0864 | ... | 0.0457 | 0.0251 | 0.0251 | 0.0270 | 0.1013 | 1.0000 | 0.0 | 0.1013 | 1.0000 | 0.0 |
ACVR2A | 0.0415 | 0.0481 | 0.0584 | 0.0489 | 0.0578 | 0.0578 | 0.0598 | 0.0625 | 0.0596 | 0.0521 | ... | 0.0640 | 0.0703 | 0.0703 | 0.0589 | 0.9833 | 1.0000 | 0.0 | 0.9833 | 1.0000 | 0.0 |
ACVR2B | 0.0533 | 0.0517 | 0.0566 | 0.0772 | 0.0533 | 0.0533 | 0.0543 | 0.0442 | 0.0471 | 0.0516 | ... | 0.0697 | 0.0761 | 0.0761 | 0.0637 | 0.9593 | 1.0000 | 0.0 | 0.9593 | 1.0000 | 0.0 |
AKT1 | 0.0603 | 0.0594 | 0.0552 | 0.0605 | 0.0516 | 0.0516 | 0.0427 | 0.0435 | 0.0464 | 0.0505 | ... | 0.0312 | 0.0393 | 0.0393 | 0.0263 | 1.0000 | 0.6440 | 0.0 | 1.0000 | 0.6440 | 0.0 |
AKT2 | 0.0602 | 0.0617 | 0.0643 | 0.0582 | 0.0534 | 0.0534 | 0.0433 | 0.0418 | 0.0493 | 0.0513 | ... | 0.0350 | 0.0548 | 0.0548 | 0.0417 | 1.0000 | 0.6077 | 0.0 | 1.0000 | 0.6077 | 0.0 |
... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
YANK2 | 0.0580 | 0.0699 | 0.0637 | 0.0602 | 0.0580 | 0.0580 | 0.0433 | 0.0470 | 0.0459 | 0.0469 | ... | 0.0452 | 0.1095 | 0.1095 | 0.6305 | 0.6321 | 1.0000 | 0.0 | 0.6321 | 1.0000 | 0.0 |
YANK3 | 0.0625 | 0.0776 | 0.0647 | 0.0598 | 0.0545 | 0.0545 | 0.0502 | 0.0537 | 0.0561 | 0.0543 | ... | 0.0862 | 0.1204 | 0.1204 | 0.5776 | 1.0000 | 0.8985 | 0.0 | 1.0000 | 0.8985 | 0.0 |
YSK1 | 0.0590 | 0.0713 | 0.0731 | 0.0606 | 0.0542 | 0.0542 | 0.0499 | 0.0471 | 0.0446 | 0.0529 | ... | 0.0267 | 0.0256 | 0.0256 | 0.0219 | 0.2558 | 1.0000 | 0.0 | 0.2558 | 1.0000 | 0.0 |
YSK4 | 0.0593 | 0.0728 | 0.0744 | 0.0734 | 0.0597 | 0.0597 | 0.0517 | 0.0400 | 0.0433 | 0.0512 | ... | 0.0484 | 0.0634 | 0.0634 | 0.0389 | 0.7907 | 1.0000 | 0.0 | 0.7907 | 1.0000 | 0.0 |
ZAK | 0.0604 | 0.0641 | 0.0659 | 0.0631 | 0.0597 | 0.0597 | 0.0454 | 0.0431 | 0.0477 | 0.0484 | ... | 0.0370 | 0.0390 | 0.0390 | 0.0408 | 0.6135 | 1.0000 | 0.0 | 0.6135 | 1.0000 | 0.0 |
303 rows × 213 columns
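The column labels in these tables combine a signed position with a single residue letter (for example `-5P`, `4E`, `0s`). The sketch below shows one way to regroup a flat row into a per-position mapping. It uses only the standard library and a toy dictionary holding a few AAK1 values from the table above. The helper names `split_label` and `to_position_matrix` are hypothetical and not part of katlas.

```python
import re

# Labels look like "-5P", "0s" or "4E": a signed position followed by one
# residue letter (upper or lower case). This pattern is an assumption based
# on the column names displayed above.
LABEL = re.compile(r"^(-?\d+)([A-Za-z])$")


def split_label(label):
    """Split a label such as '-5P' into (position, residue)."""
    m = LABEL.match(label)
    if m is None:
        raise ValueError(f"unexpected column label: {label!r}")
    return int(m.group(1)), m.group(2)


def to_position_matrix(row):
    """Regroup a flat {label: value} mapping into {position: {residue: value}}."""
    matrix = {}
    for label, value in row.items():
        pos, aa = split_label(label)
        matrix.setdefault(pos, {})[aa] = value
    return matrix


# Toy row: a handful of AAK1 values copied from the table above.
aak1 = {"-5P": 0.0720, "-5G": 0.0245, "4E": 0.0457, "0s": 0.1013, "0t": 1.0000}
matrix = to_position_matrix(aak1)
```

With a real row, something like `Data.get_pspa_st_norm().loc["AAK1"].to_dict()` could be passed in place of the toy dictionary, assuming the index and column labels match what is displayed here.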
Normalized all
Data.get_pspa_all_norm()
-5P | -5G | -5A | -5C | -5S | -5T | -5V | -5I | -5L | -5M | ... | 5H | 5K | 5R | 5Q | 5N | 5D | 5E | 5s | 5t | 5y | |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
kinase | |||||||||||||||||||||
AAK1 | 0.0720 | 0.0245 | 0.0284 | 0.0456 | 0.0425 | 0.0425 | 0.0951 | 0.1554 | 0.0993 | 0.0864 | ... | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN |
ACVR2A | 0.0415 | 0.0481 | 0.0584 | 0.0489 | 0.0578 | 0.0578 | 0.0598 | 0.0625 | 0.0596 | 0.0521 | ... | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN |
ACVR2B | 0.0533 | 0.0517 | 0.0566 | 0.0772 | 0.0533 | 0.0533 | 0.0543 | 0.0442 | 0.0471 | 0.0516 | ... | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN |
AKT1 | 0.0603 | 0.0594 | 0.0552 | 0.0605 | 0.0516 | 0.0516 | 0.0427 | 0.0435 | 0.0464 | 0.0505 | ... | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN |
AKT2 | 0.0602 | 0.0617 | 0.0643 | 0.0582 | 0.0534 | 0.0534 | 0.0433 | 0.0418 | 0.0493 | 0.0513 | ... | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN |
... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
KDR | 0.0634 | 0.0672 | 0.0556 | 0.0517 | 0.0541 | 0.0526 | 0.0427 | 0.0420 | 0.0428 | 0.0476 | ... | 0.0543 | 0.0653 | 0.0771 | 0.0509 | 0.0582 | 0.0414 | 0.0387 | 0.0335 | 0.0335 | 0.0406 |
FLT4 | 0.0457 | 0.0531 | 0.0488 | 0.0553 | 0.0512 | 0.0471 | 0.0432 | 0.0499 | 0.0474 | 0.0530 | ... | 0.0624 | 0.0564 | 0.0559 | 0.0537 | 0.0610 | 0.0620 | 0.0528 | 0.0600 | 0.0600 | 0.0464 |
WEE1_TYR | 0.0531 | 0.0640 | 0.0559 | 0.0560 | 0.0433 | 0.0435 | 0.0568 | 0.0571 | 0.0637 | 0.0562 | ... | 0.0585 | 0.1058 | 0.1658 | 0.0447 | 0.0495 | 0.0312 | 0.0365 | 0.0453 | 0.0453 | 0.0490 |
YES1 | 0.0677 | 0.0571 | 0.0537 | 0.0530 | 0.0527 | 0.0505 | 0.0435 | 0.0375 | 0.0400 | 0.0463 | ... | 0.0593 | 0.0662 | 0.0840 | 0.0559 | 0.0604 | 0.0422 | 0.0482 | 0.0374 | 0.0374 | 0.0411 |
ZAP70 | 0.0602 | 0.0880 | 0.0623 | 0.0496 | 0.0471 | 0.0514 | 0.0465 | 0.0380 | 0.0307 | 0.0526 | ... | 0.0484 | 0.0477 | 0.0290 | 0.0520 | 0.0537 | 0.0709 | 0.0710 | 0.0862 | 0.0862 | 0.0605 |
396 rows × 236 columns
(?) Combined PSPA
Data.get_combine()
-5P | -5G | -5A | -5C | -5S | -5T | -5V | -5I | -5L | -5M | ... | 4E | 4s | 4t | 4y | 0s | 0t | 0y | 0S | 0T | 0Y | |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
kinase | |||||||||||||||||||||
CK1A | 0.029499 | 0.106195 | 0.058997 | 0.008850 | 0.029499 | 0.020649 | 0.035398 | 0.029499 | 0.085546 | 0.061947 | ... | 0.124629 | 0.074184 | 0.029674 | 0.023739 | 0.800587 | 0.129032 | 0.070381 | 0.800587 | 0.129032 | 0.070381 |
CK1D | 0.047619 | 0.084942 | 0.082368 | 0.011583 | 0.029601 | 0.023166 | 0.052767 | 0.034749 | 0.047619 | 0.024453 | ... | 0.143969 | 0.067445 | 0.018158 | 0.032425 | 0.745174 | 0.213642 | 0.041184 | 0.745174 | 0.213642 | 0.041184 |
CK1E | 0.060386 | 0.099034 | 0.086957 | 0.024155 | 0.016908 | 0.024155 | 0.057971 | 0.028986 | 0.060386 | 0.024155 | ... | 0.168704 | 0.075795 | 0.012225 | 0.029340 | 0.729469 | 0.219807 | 0.050725 | 0.729469 | 0.219807 | 0.050725 |
CK1G1 | 0.034749 | 0.111969 | 0.073359 | 0.015444 | 0.023166 | 0.019305 | 0.023166 | 0.042471 | 0.046332 | 0.027027 | ... | 0.119691 | 0.061776 | 0.007722 | 0.054054 | 0.818533 | 0.146718 | 0.034749 | 0.818533 | 0.146718 | 0.034749 |
CK1G2 | 0.023055 | 0.086455 | 0.112392 | 0.011527 | 0.023055 | 0.031700 | 0.060519 | 0.031700 | 0.043228 | 0.025937 | ... | 0.127907 | 0.061047 | 0.017442 | 0.040698 | 0.835735 | 0.126801 | 0.037464 | 0.835735 | 0.126801 | 0.037464 |
... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
VRK2 | 0.050454 | 0.049864 | 0.046913 | 0.047503 | 0.042266 | 0.042266 | 0.042266 | 0.037619 | 0.037545 | 0.038430 | ... | 0.033657 | 0.045816 | 0.045816 | 0.059786 | 0.294284 | 0.705716 | 0.000000 | 0.294284 | 0.705716 | 0.000000 |
WNK4 | 0.028356 | 0.040191 | 0.041420 | 0.041804 | 0.044571 | 0.044571 | 0.040267 | 0.048490 | 0.051333 | 0.044571 | ... | 0.029169 | 0.028236 | 0.028236 | 0.030803 | 0.455990 | 0.544010 | 0.000000 | 0.455990 | 0.544010 | 0.000000 |
YANK2 | 0.039266 | 0.047322 | 0.043125 | 0.040756 | 0.039266 | 0.039266 | 0.029314 | 0.031819 | 0.031074 | 0.031751 | ... | 0.022667 | 0.054912 | 0.054912 | 0.316183 | 0.387292 | 0.612708 | 0.000000 | 0.387292 | 0.612708 | 0.000000 |
YANK3 | 0.045607 | 0.056626 | 0.047212 | 0.043637 | 0.039769 | 0.039769 | 0.036632 | 0.039186 | 0.040937 | 0.039623 | ... | 0.043549 | 0.060827 | 0.060827 | 0.291806 | 0.526732 | 0.473268 | 0.000000 | 0.526732 | 0.473268 | 0.000000 |
YSK4 | 0.042943 | 0.052719 | 0.053878 | 0.053154 | 0.043233 | 0.043233 | 0.037439 | 0.028967 | 0.031356 | 0.037077 | ... | 0.036228 | 0.047455 | 0.047455 | 0.029117 | 0.441559 | 0.558441 | 0.000000 | 0.441559 | 0.558441 | 0.000000 |
390 rows × 213 columns
Reference data to calculate percentile
PSPA Ser/Thr scores across the human phosphoproteome
Data.get_pspa_st_pct()
kinase | AAK1 | ACVR2A | ACVR2B | AKT1 | AKT2 | AKT3 | ALK2 | ALK4 | ALPHAK3 | AMPKA1 | ... | VRK1 | VRK2 | WNK1 | WNK3 | WNK4 | YANK2 | YANK3 | YSK1 | YSK4 | ZAK |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
0 | -10.960 | -0.581 | 0.329 | -3.891 | -3.591 | -5.312 | 0.814 | -0.559 | -0.933 | -2.607 | ... | -4.682 | -2.854 | -1.669 | -1.527 | -2.965 | -2.877 | -1.792 | -6.283 | -1.715 | -3.204 |
1 | -6.788 | -0.166 | 0.307 | -5.886 | -4.786 | -6.576 | 1.561 | -0.865 | -3.399 | -3.261 | ... | -5.670 | -2.817 | -4.071 | -3.394 | -5.097 | -1.874 | -1.480 | -8.709 | -3.708 | -6.093 |
2 | -9.031 | 1.232 | 1.775 | -6.164 | -5.446 | -8.330 | 0.778 | -1.355 | -0.929 | -4.998 | ... | -5.832 | -3.243 | -4.249 | -2.750 | -5.053 | 0.581 | -0.503 | -6.448 | -1.897 | -2.847 |
3 | -4.849 | 2.272 | 2.057 | -2.886 | -2.380 | -3.635 | 1.547 | 2.735 | -2.826 | -1.697 | ... | -2.758 | -1.699 | -1.725 | -0.091 | -0.673 | 0.313 | -0.207 | -2.316 | -0.054 | -1.118 |
4 | -6.597 | -1.388 | -0.956 | -2.834 | -3.794 | -4.969 | -1.862 | -1.717 | -2.653 | -3.515 | ... | -1.546 | -1.457 | -1.278 | 0.511 | -1.046 | -0.314 | -1.023 | -2.482 | -2.227 | -1.593 |
... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
89779 | -7.310 | 3.942 | 3.002 | -6.844 | -4.643 | -7.518 | 2.899 | 3.402 | -1.935 | -3.813 | ... | -5.589 | -3.034 | -4.303 | -4.057 | -4.097 | 0.384 | -0.680 | -5.214 | -2.004 | -1.670 |
89780 | -8.009 | 1.134 | 1.012 | -4.002 | -2.917 | -4.024 | -0.157 | -0.607 | 0.828 | -4.222 | ... | -5.383 | -2.866 | -2.781 | -2.824 | -3.990 | 1.137 | 0.139 | -5.171 | -1.500 | -2.052 |
89781 | -0.940 | -2.553 | -2.435 | -5.031 | -3.635 | -6.779 | -0.982 | -0.106 | -3.507 | -4.883 | ... | -2.522 | -0.100 | -3.280 | -4.375 | -3.447 | -5.780 | -3.851 | -4.275 | -3.504 | -4.381 |
89782 | -3.753 | 1.451 | 1.883 | -5.583 | -5.253 | -7.164 | 1.226 | -0.399 | 3.341 | -5.932 | ... | -1.930 | -1.420 | -5.949 | -4.854 | -5.401 | -1.853 | -2.068 | -2.824 | -0.340 | -1.326 |
89783 | -1.540 | -2.180 | -2.014 | -2.416 | -0.592 | -1.364 | -3.320 | -0.826 | -4.438 | -1.393 | ... | -1.979 | -0.661 | -2.586 | -4.076 | -2.832 | -0.575 | -0.859 | -2.415 | -2.999 | -2.550 |
89784 rows × 303 columns
PSPA Tyr scores across the human phosphoproteome
Data.get_pspa_tyr_pct()
kinase | ABL1 | TNK2 | ALK | ABL2 | AXL | BLK | BMPR2_TYR | PTK6 | BTK | CSF1R | ... | NTRK3 | TXK | TYK2 | TYRO3 | FLT1 | KDR | FLT4 | WEE1_TYR | YES1 | ZAP70 |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
0 | -0.709617 | -3.624831 | -2.136338 | -0.022776 | -0.737589 | 2.345905 | 0.504821 | 2.417165 | -0.121611 | -1.205218 | ... | -0.368491 | 1.187208 | -1.601712 | -1.143748 | -0.891566 | -1.888643 | -1.758264 | -1.610344 | 4.545175 | 0.280174 |
1 | 0.986158 | -1.645273 | -1.183920 | 0.553010 | -1.098784 | -1.245678 | -0.276461 | -0.156496 | -1.322652 | -0.684989 | ... | -0.777541 | -0.385554 | -0.624216 | -0.737089 | -0.315447 | -1.293708 | -1.182827 | -1.891533 | -0.456570 | -2.465316 |
2 | -4.000671 | 0.543232 | -4.721913 | -3.662958 | -2.086910 | -6.134138 | -0.380569 | -2.595287 | -3.307418 | -2.386468 | ... | -6.363768 | -4.401061 | -1.096380 | -2.017356 | -2.000577 | -1.511887 | -1.844273 | 2.112679 | -3.783810 | -5.066184 |
3 | 1.496697 | 1.335568 | -1.360722 | 1.760211 | 1.016971 | -0.106255 | -0.547279 | -0.916277 | -0.572105 | -1.044687 | ... | -1.244076 | 1.742046 | -1.782387 | 0.598170 | -1.859460 | -1.254715 | -2.740284 | 0.392029 | -1.136538 | -0.588075 |
4 | -0.992936 | -1.729882 | -1.510540 | -0.906642 | -0.261331 | -0.977430 | 0.886090 | -0.460256 | 0.173188 | 0.970516 | ... | 1.234777 | 0.244627 | 0.108616 | 0.952371 | 0.615983 | 0.423058 | 0.148546 | 1.342049 | -0.721687 | 0.909419 |
... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
7310 | -0.696420 | -1.151890 | -2.088003 | -0.799744 | -0.758219 | -3.110850 | -0.060125 | 0.637491 | -1.755881 | -1.128610 | ... | -1.532597 | -0.325391 | -1.102283 | -0.877872 | 0.152143 | -1.525713 | -1.607762 | -1.833375 | -2.765932 | -0.503243 |
7311 | 0.741063 | 0.847899 | 0.151897 | 0.858172 | 0.348263 | 0.187113 | 0.204329 | -0.822278 | -1.292904 | 0.546294 | ... | -0.195173 | -0.924637 | 0.086444 | 0.114326 | -2.471192 | -0.677875 | -1.549373 | -2.780892 | -1.075559 | 0.499488 |
7312 | -2.858631 | 0.269949 | -3.219751 | -2.741632 | -1.810684 | -5.456068 | -0.460094 | -2.476075 | -4.069701 | -1.687347 | ... | -6.592211 | -5.588492 | -2.446790 | -2.328961 | -2.656506 | -0.631762 | -2.268915 | -0.440015 | -3.189601 | -2.032972 |
7313 | 0.737694 | -0.477689 | -0.646850 | 0.928066 | 0.187149 | -1.000041 | -0.283551 | -3.053869 | -0.750475 | 0.132043 | ... | -0.122134 | -1.275022 | -0.020350 | 0.483620 | -0.060204 | 1.378042 | 0.573273 | -2.383657 | -0.246005 | 1.174693 |
7314 | 2.115113 | 0.153795 | 0.356357 | 1.846239 | -0.856035 | -0.422296 | -0.985140 | 0.554181 | 0.381133 | -1.666383 | ... | -1.670434 | 1.684176 | -0.508297 | -0.304215 | -2.045909 | -1.629804 | -2.227050 | -2.294855 | 0.428825 | -1.789086 |
7315 rows × 93 columns
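The reference tables above hold one score per phosphosite for each kinase, so a new score can be placed on that distribution as a percentile. The sketch below is a minimal illustration using only the standard library. It defines the percentile as the share of reference scores less than or equal to the query, which is an assumption and may differ from how katlas computes it. The function name `percentile_of_score` is hypothetical.

```python
from bisect import bisect_right


def percentile_of_score(reference, score):
    """Return the percentage of reference scores that are <= score."""
    ordered = sorted(reference)
    # bisect_right counts how many sorted values are <= score.
    return 100.0 * bisect_right(ordered, score) / len(ordered)


# Toy reference: the first five AAK1 scores shown in the Ser/Thr table above.
aak1_ref = [-10.960, -6.788, -9.031, -4.849, -6.597]

# Where does a hypothetical new AAK1 score of -6.0 fall?
pct = percentile_of_score(aak1_ref, -6.0)
```

In practice the full column, for example `Data.get_pspa_st_pct()["AAK1"]`, would serve as the reference rather than five values.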
Amino acids
Data.get_aa_info().head()
Name | SMILES | MW | pKa1 | pKb2 | pKx3 | pl4 | H | VSC | P1 | P2 | SASA | NCISC | phospho | |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
aa | ||||||||||||||
A | Alanine | C[C@@H](C(=O)O)N | 89.10 | 2.34 | 9.69 | NaN | 6.00 | 0.62 | 27.5 | 8.1 | 0.046 | 1.181 | 0.007187 | 0 |
C | Cysteine | C([C@@H](C(=O)O)N)S | 121.16 | 1.96 | 10.28 | 8.18 | 5.07 | 0.29 | 44.6 | 5.5 | 0.128 | 1.461 | -0.036610 | 0 |
D | Aspartic acid | C([C@@H](C(=O)O)N)C(=O)O | 133.11 | 1.88 | 9.60 | 3.65 | 2.77 | -0.90 | 40.0 | 13.0 | 0.105 | 1.587 | -0.023820 | 0 |
E | Glutamic acid | C(CC(=O)O)[C@@H](C(=O)O)N | 147.13 | 2.19 | 9.67 | 4.25 | 3.22 | -0.74 | 62.0 | 12.3 | 0.151 | 1.862 | 0.006802 | 0 |
F | Phenylalanine | c1ccc(cc1)C[C@@H](C(=O)O)N | 165.19 | 1.83 | 9.13 | NaN | 5.48 | 1.19 | 115.5 | 5.2 | 0.290 | 2.228 | 0.037552 | 0 |
RDKit features
Data.get_aa_rdkit().head()
MaxAbsEStateIndex | MinAbsEStateIndex | MinEStateIndex | qed | MolWt | MinPartialCharge | MaxAbsPartialCharge | FpDensityMorgan1 | FpDensityMorgan2 | FpDensityMorgan3 | ... | fr_Ar_N | fr_C_O | fr_NH0 | fr_NH1 | fr_NH2 | fr_SH | fr_imidazole | fr_priamide | fr_sulfide | fr_unbrch_alkane | |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
aa | |||||||||||||||||||||
A | 9.574074 | 0.731481 | -0.962963 | 0.451352 | 89.094 | -0.480094 | 0.480094 | 2.000000 | 2.166667 | 2.166667 | ... | 0.0 | 1.0 | 0.0 | 0.0 | 1.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 |
C | 9.756435 | 0.189815 | -1.004630 | 0.424382 | 121.161 | -0.480064 | 0.480064 | 2.000000 | 2.428571 | 2.428571 | ... | 0.0 | 1.0 | 0.0 | 0.0 | 1.0 | 1.0 | 0.0 | 0.0 | 0.0 | 0.0 |
D | 9.846435 | 0.532407 | -1.294074 | 0.452021 | 133.103 | -0.481175 | 0.481175 | 1.444444 | 1.888889 | 2.000000 | ... | 0.0 | 2.0 | 0.0 | 0.0 | 1.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 |
E | 9.993880 | 0.023148 | -1.165509 | 0.485976 | 147.130 | -0.481229 | 0.481229 | 1.400000 | 1.900000 | 2.200000 | ... | 0.0 | 2.0 | 0.0 | 0.0 | 1.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 |
F | 10.378642 | 0.385093 | -0.959395 | 0.690463 | 165.192 | -0.480078 | 0.480078 | 1.416667 | 2.000000 | 2.500000 | ... | 0.0 | 1.0 | 0.0 | 0.0 | 1.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 |
5 rows × 106 columns
Morgan features
Data.get_aa_morgan().head()
morgan_1 | morgan_11 | morgan_24 | morgan_27 | morgan_70 | morgan_74 | morgan_79 | morgan_80 | morgan_82 | morgan_116 | ... | morgan_1879 | morgan_1882 | morgan_1898 | morgan_1911 | morgan_1912 | morgan_1926 | morgan_1937 | morgan_1942 | morgan_1946 | morgan_1970 | |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
aa | |||||||||||||||||||||
A | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | ... | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
C | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | ... | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0 |
D | 1 | 0 | 0 | 0 | 1 | 0 | 0 | 1 | 0 | 0 | ... | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
E | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | ... | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
F | 1 | 0 | 0 | 0 | 0 | 0 | 1 | 1 | 0 | 0 | ... | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
5 rows × 168 columns
Others
Number of random amino acids for each kinase when calculating the PSPA score
# Data.get_num_dict()
pd.DataFrame.from_dict(Data.get_num_dict(), orient="index")
0 | |
---|---|
SYK | 18 |
PTK2 | 18 |
ZAP70 | 18 |
ERBB2 | 18 |
CSK | 18 |
... | ... |
YANK3 | 17 |
YSK1 | 17 |
ZAK | 17 |
EEF2K | 17 |
FAM20C | 17 |
396 rows × 1 columns