
The CMASA Home Page


 

1. About CMASA

   CMASA is a fast and accurate algorithm for detecting local protein structural similarity. It can predict protein function by comparing a query structure against a database of functional sites (currently, CSA is available). CMASA can also detect local structural similarity by searching the entire nrPDB or nrSCOP with a functional site as the query.

If you use CMASA, please cite:

Li GH, Huang JF. CMASA: an accurate algorithm for detecting local protein structural similarity and its application to enzyme catalytic site annotation. BMC Bioinformatics. 2010 Aug 27;11(1):439.

Thanks to all the members of the Structural Bioinformatics Laboratory, KIZ, CAS.

 

2. Availability (use CMASA by downloading it to a local computer)

CMASA is written in C/C++ and is distributed under the GPL (GNU Lesser General Public License), so you can freely download and use it (recommended).

Download site: CMASA_download (~50 MB)

This file contains:

Readme (how to install and use);

Source files;

Executable files (Windows XP and 2003);

Pre-compiled Linux executable file (try this if compiling fails on Linux);

Database: contains nrSCOP (14519 structures) and CSA (15341 catalytic sites).

  For more information, please read README.txt or send an email to ligonghua_88@yahoo.com.cn

 

3. Use CMASA by mail server

   We also offer a mail service for running CMASA, but we do not recommend it because the mail server is not stable.
Note that the mail-based database contains nrPDB (18757 structures) and nrCSA (1320 catalytic sites).

To use the mail-based CMASA, send an email to cmasa@sohu.com. The server will parse your email, run CMASA automatically, and send the result back to you as soon as possible.

  The format of the email is as follows:

The subject of your email must contain the string "cmasa" (the examples below use mixed letter case).

For example:

       cmasa is lazy

       This is CMaSa

       1mct_CMASA

    The body of your email must contain the protein 3D coordinates in pdb or ent format; do not attach files. Other information will be ignored. CMASA will automatically detect what you want to do.
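The subject and body rules above can be sketched with Python's standard email library. This is only an illustration, not part of CMASA: the function name build_cmasa_email and the "label" argument are hypothetical, and actually sending the message is left to your own mail client or SMTP setup.

```python
from email.message import EmailMessage

CMASA_ADDRESS = "cmasa@sohu.com"  # mail server address given on this page


def build_cmasa_email(sender: str, pdb_text: str, label: str = "query") -> EmailMessage:
    """Build a plain-text message whose subject contains "cmasa".

    The coordinates go into the body as text; nothing is attached,
    because this page asks you not to attach files.
    """
    msg = EmailMessage()
    msg["From"] = sender
    msg["To"] = CMASA_ADDRESS
    msg["Subject"] = f"{label}_cmasa"  # any subject containing "cmasa" should do
    msg.set_content(pdb_text)          # single text/plain part, no attachment
    return msg
```

For example, build_cmasa_email("me@example.org", coordinates, label="1mct") yields a message with the subject "1mct_cmasa".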

Generally, if the query contains 10 or more amino acids, CMASA assumes that you want to find the functional sites of the query PDB, so it uses your input structure to search the CSA database (1320 catalytic sites). This process finishes in 5~15 seconds.

However, if the query contains fewer than 10 amino acids, CMASA assumes that you want to find protein structures that contain the query functional site, so it searches the entire nrPDB (18757 structures). This process finishes in 1~3 minutes.
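The 10-residue rule described above can be checked locally before you send a query. The sketch below is not CMASA's own code: it assumes the standard fixed-column PDB layout (chain ID in column 22, residue number and insertion code in columns 23-27), and the names count_residues, guess_search_mode and the two mode labels are hypothetical. How CMASA itself counts amino acids is not documented here.

```python
def count_residues(pdb_text: str) -> int:
    """Count distinct residues among the ATOM records of a PDB text."""
    residues = set()
    for line in pdb_text.splitlines():
        if line.startswith("ATOM") and len(line) > 22:
            chain_id = line[21]                # PDB column 22
            residue_id = line[22:27].strip()   # PDB columns 23-27
            residues.add((chain_id, residue_id))
    return len(residues)


def guess_search_mode(pdb_text: str, threshold: int = 10) -> str:
    """Guess which search the mail server would run for this query."""
    if count_residues(pdb_text) >= threshold:
        return "site_annotation"   # query structure searched against CSA
    return "structure_search"      # query site searched against nrPDB
```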
 

For example, you can paste the entire 3D coordinates of chain A of 1mct to find its functional sites. Note: if you paste multiple chains into the email, CMASA will read only the first chain.

ATOM 1 N ILE A 16 -12.396 19.242 8.854 1.00 8.19 N
ATOM 2 CA ILE A 16 -12.881 18.808 10.195 1.00 9.20 C
ATOM 3 C ILE A 16 -12.916 17.275 10.181 1.00 10.50 C
ATOM 4 O ILE A 16 -13.607 16.689 9.344 1.00 9.60 O
ATOM 5 CB ILE A 16 -14.332 19.358 10.468 1.00 9.72 C
ATOM 6 CG1 ILE A 16 -14.341 20.905 10.454 1.00 8.90 C
ATOM 7 CG2 ILE A 16 -14.828 18.817 11.820 1.00 11.81 C
ATOM 8 CD1 ILE A 16 -13.682 21.580 11.631 1.00 14.04 C
ATOM 9 H1 ILE A 16 -11.464 18.839 8.662 1.00 0.00 H
ATOM 10 H2 ILE A 16 -13.079 18.918 8.134 1.00 0.00 H
ATOM 11 H3 ILE A 16 -12.343 20.286 8.813 1.00 0.00 H
ATOM 12 N VAL A 17 -12.137 16.656 11.070 1.00 9.38 N
ATOM 13 CA VAL A 17 -12.055 15.200 11.212 1.00 12.90 C
ATOM 14 C VAL A 17 -12.988 14.857 12.370 1.00 10.21 C
ATOM 15 O VAL A 17 -12.947 15.510 13.422 1.00 12.59 O
ATOM 16 CB VAL A 17 -10.582 14.713 11.563 1.00 10.78 C
ATOM 17 CG1 VAL A 17 -10.553 13.200 11.692 1.00 10.51 C
ATOM 18 CG2 VAL A 17 -9.604 15.126 10.490 1.00 10.11 C
ATOM 19 H VAL A 17 -11.625 17.179 11.713 1.00 0.00 H
ATOM 20 N GLY A 18 -13.846 13.854 12.167 1.00 11.01 N
ATOM 21 CA GLY A 18 -14.712 13.341 13.232 1.00 11.42 C
ATOM 22 C GLY A 18 -15.893 14.212 13.604 1.00 12.41 C
ATOM 23 O GLY A 18 -16.435 14.069 14.704 1.00 13.15 O
ATOM 24 H GLY A 18 -13.842 13.416 11.292 1.00 0.00 H
ATOM 25 N GLY A 19 -16.264 15.117 12.694 1.00 12.57 N
ATOM 26 CA GLY A 19 -17.400 15.989 12.925 1.00 12.46 C
ATOM 27 C GLY A 19 -18.636 15.481 12.203 1.00 15.15 C
ATOM 28 O GLY A 19 -18.702 14.305 11.830 1.00 15.96 O
ATOM 29 H GLY A 19 -15.743 15.219 11.869 1.00 0.00 H
ATOM 30 N TYR A 20 -19.593 16.343 11.933 1.00 12.41 N
ATOM 31 CA TYR A 20 -20.841 15.974 11.275 1.00 13.94 C
ATOM 32 C TYR A 20 -21.164 17.066 10.254 1.00 13.07 C
ATOM 33 O TYR A 20 -20.680 18.188 10.408 1.00 12.74 O
                 ...
                 ...
ATOM 1997 N ILE A 242 -10.971 48.494 12.739 1.00 13.90 N
ATOM 1998 CA ILE A 242 -12.218 48.762 12.030 1.00 18.41 C
ATOM 1999 C ILE A 242 -12.518 50.270 12.112 1.00 23.39 C
ATOM 2000 O ILE A 242 -12.913 50.885 11.116 1.00 24.45 O
ATOM 2001 CB ILE A 242 -13.357 47.934 12.659 1.00 18.62 C
ATOM 2002 CG1 ILE A 242 -13.118 46.452 12.333 1.00 14.48 C
ATOM 2003 CG2 ILE A 242 -14.707 48.403 12.150 1.00 20.39 C
ATOM 2004 CD1 ILE A 242 -14.082 45.518 13.032 1.00 15.59 C
ATOM 2005 H ILE A 242 -10.989 47.958 13.559 1.00 0.00 H
ATOM 2006 N ALA A 243 -12.270 50.869 13.273 1.00 21.12 N
ATOM 2007 CA ALA A 243 -12.548 52.290 13.469 1.00 28.37 C
ATOM 2008 C ALA A 243 -11.653 53.162 12.606 1.00 28.57 C
ATOM 2009 O ALA A 243 -12.047 54.288 12.307 1.00 60.40 O
ATOM 2010 CB ALA A 243 -12.339 52.707 14.912 1.00 21.66 C
ATOM 2011 H ALA A 243 -11.886 50.342 14.004 1.00 0.00 H
ATOM 2012 N ALA A 244 -10.471 52.680 12.247 1.00 24.31 N
ATOM 2013 CA ALA A 244 -9.539 53.472 11.470 1.00 20.55 C
ATOM 2014 C ALA A 244 -9.621 53.241 9.958 1.00 43.26 C
ATOM 2015 O ALA A 244 -8.894 53.936 9.231 1.00 37.20 O
ATOM 2016 CB ALA A 244 -8.138 53.171 11.964 1.00 17.82 C
ATOM 2017 H ALA A 244 -10.217 51.766 12.495 1.00 0.00 H
ATOM 2018 N ASN A 245 -10.460 52.321 9.470 1.00 21.44 N
ATOM 2019 CA ASN A 245 -10.460 51.936 8.058 1.00 20.81 C
ATOM 2020 C ASN A 245 -11.822 51.767 7.376 1.00 21.52 C
ATOM 2021 O ASN A 245 -11.836 51.331 6.206 1.00 42.02 O
ATOM 2022 CB ASN A 245 -9.671 50.634 7.917 1.00 14.43 C
ATOM 2023 CG ASN A 245 -8.203 50.688 8.272 1.00 23.35 C
ATOM 2024 OD1 ASN A 245 -7.772 50.324 9.364 1.00 30.24 O
ATOM 2025 ND2 ASN A 245 -7.344 51.115 7.381 1.00 32.78 N
ATOM 2026 OXT ASN A 245 -12.881 52.054 7.967 1.00 40.68 O
ATOM 2027 H ASN A 245 -11.150 51.923 10.044 1.00 0.00 H
ATOM 2028 HD21 ASN A 245 -6.395 51.113 7.632 1.00 0.00 H
ATOM 2029 HD22 ASN A 245 -7.697 51.458 6.539 1.00 0.00 H
TER 2030 ASN A 245
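Because only the first chain is read, it can be useful to trim a multi-chain file yourself before pasting it. The following is a minimal sketch under stated assumptions: the input is standard fixed-column PDB text (chain ID in column 22), only ATOM records are kept, and the function name first_chain_only is hypothetical. It does not claim to reproduce how CMASA selects the first chain.

```python
def first_chain_only(pdb_text: str) -> str:
    """Keep the ATOM records of the first chain and drop everything after it."""
    kept = []
    first_chain = None
    for line in pdb_text.splitlines():
        if line.startswith("TER"):
            if kept:
                break          # the first chain has ended
            continue
        if not line.startswith("ATOM") or len(line) < 22:
            continue           # ignore non-coordinate lines
        chain_id = line[21]    # PDB column 22
        if first_chain is None:
            first_chain = chain_id
        elif chain_id != first_chain:
            break              # a second chain starts here
        kept.append(line)
    return "\n".join(kept)
```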

Or you can paste just the functional site (H57-D102-S195) of 1mct to search for other proteins that contain similar functional sites:

ATOM 341 N HIS A 57 -1.886 33.642 8.390 1.00 8.89 1MCT 443
ATOM 342 CA HIS A 57 -1.193 32.817 7.411 1.00 6.57 1MCT 444
ATOM 343 C HIS A 57 -2.100 32.525 6.210 1.00 10.35 1MCT 445
ATOM 344 O HIS A 57 -1.632 31.928 5.237 1.00 11.04 1MCT 446
ATOM 345 CB HIS A 57 -0.693 31.511 8.059 1.00 7.36 1MCT 447
ATOM 346 CG HIS A 57 -1.732 30.473 8.333 1.00 9.91 1MCT 448
ATOM 347 ND1 HIS A 57 -2.309 30.316 9.564 1.00 13.29 1MCT 449
ATOM 348 CD2 HIS A 57 -2.281 29.526 7.539 1.00 8.39 1MCT 450
ATOM 349 CE1 HIS A 57 -3.176 29.329 9.573 1.00 12.10 1MCT 451
ATOM 350 NE2 HIS A 57 -3.157 28.855 8.345 1.00 7.66 1MCT 452
ATOM 351 H HIS A 57 -2.148 33.305 9.277 1.00 0.00 1MCT 453
ATOM 352 HD1 HIS A 57 -2.115 30.859 10.362 1.00 0.00 1MCT 454
ATOM 353 HE2 HIS A 57 -3.714 28.161 7.992 1.00 0.00 1MCT 455
ATOM 782 N ASP A 102 -1.585 35.105 14.976 1.00 8.93 1MCT 884
ATOM 783 CA ASP A 102 -2.123 34.644 13.694 1.00 9.04 1MCT 885
ATOM 784 C ASP A 102 -3.579 35.105 13.447 1.00 12.62 1MCT 886
ATOM 785 O ASP A 102 -4.553 34.364 13.624 1.00 10.11 1MCT 887
ATOM 786 CB ASP A 102 -2.021 33.112 13.655 1.00 10.07 1MCT 888
ATOM 787 CG ASP A 102 -2.260 32.551 12.271 1.00 13.39 1MCT 889
ATOM 788 OD1 ASP A 102 -2.276 33.289 11.267 1.00 11.87 1MCT 890
ATOM 789 OD2 ASP A 102 -2.410 31.329 12.123 1.00 12.10 1MCT 891
ATOM 790 H ASP A 102 -1.556 34.458 15.714 1.00 0.00 1MCT 892
ATOM 1595 N SER A 195 -8.213 26.319 8.319 1.00 7.52 1MCT1697
ATOM 1596 CA SER A 195 -7.543 27.542 8.776 1.00 4.95 1MCT1698
ATOM 1597 C SER A 195 -8.366 28.808 8.600 1.00 8.59 1MCT1699
ATOM 1598 O SER A 195 -8.992 28.970 7.546 1.00 7.91 1MCT1700
ATOM 1599 CB SER A 195 -6.251 27.779 8.028 1.00 8.20 1MCT1701
ATOM 1600 OG SER A 195 -5.277 27.074 8.766 1.00 10.14 1MCT1702
ATOM 1601 H SER A 195 -7.696 25.677 7.786 1.00 0.00 1MCT1703
ATOM 1602 HG SER A 195 -4.476 27.030 8.281 1.00 0.00 1MCT1704
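A site query like the one above can be cut out of a full PDB file by residue number. The sketch below assumes the standard fixed-column PDB layout (chain ID in column 22, residue number in columns 23-26); the function name extract_site is hypothetical and not part of CMASA.

```python
def extract_site(pdb_text: str, chain_id: str, residue_numbers) -> str:
    """Return the ATOM records of the listed residues in one chain."""
    wanted = set(residue_numbers)
    kept = []
    for line in pdb_text.splitlines():
        if not line.startswith("ATOM") or len(line) < 26:
            continue
        try:
            number = int(line[22:26])   # PDB columns 23-26
        except ValueError:
            continue                    # residue number is not an integer
        if line[21] == chain_id and number in wanted:
            kept.append(line)
    return "\n".join(kept)
```

For the 1mct example, extract_site(pdb_text, "A", [57, 102, 195]) would keep only the atoms of His57, Asp102 and Ser195.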

 

The parameters of CMASA can be changed: if you understand how CMASA works, you can change one or more of its parameters. To do so, add one line per parameter anywhere in the email body. The format is "parametername = value", for example:

MaxDistCa   = 3.2 (default is 3)

MaxDistSide = 4.2 (default is 4)

CrmsdCutoff =1.1 (default is 1.2)

Superposition=1 (1 means that the results will be superposed; default is 0)

Superposition_cutoff = 1e-3 (default is 1e-4)
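The parameter lines can also be generated rather than typed by hand. This is a minimal sketch, assuming only the five parameter names listed on this page are accepted; the helper names parameter_lines and add_parameters are hypothetical. Values are passed as strings so that they are written exactly as you spell them (for example "1e-3").

```python
KNOWN_PARAMETERS = {
    "MaxDistCa",
    "MaxDistSide",
    "CrmsdCutoff",
    "Superposition",
    "Superposition_cutoff",
}


def parameter_lines(overrides: dict) -> list:
    """Format overrides as "parametername = value" lines."""
    unknown = sorted(set(overrides) - KNOWN_PARAMETERS)
    if unknown:
        raise ValueError(f"not listed on this page: {', '.join(unknown)}")
    return [f"{name} = {value}" for name, value in overrides.items()]


def add_parameters(body: str, overrides: dict) -> str:
    """Put the parameter lines in front of the coordinates in the email body."""
    return "\n".join(parameter_lines(overrides) + [body])
```

Placing the lines before the coordinates is only a convenience here, since the lines may appear anywhere in the body.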