SIM FTP Database¶
Sistema de Informação sobre Mortalidade (Mortality Information System)¶
[1]:
from pysus.ftp.databases.sim import SIM
sim = SIM().load() # Loads the files from DATASUS
[2]:
sim.metadata
[2]:
{'long_name': 'Sistema de Informação sobre Mortalidade',
'source': 'http://sim.saude.gov.br',
'description': ''}
[3]:
sim.groups
[3]:
{'CID10': 'DO', 'CID9': 'DOR'}
[4]:
sim.paths
[4]:
[/dissemin/publicos/SIM/CID10/DORES, /dissemin/publicos/SIM/CID9/DORES]
For more information about CID9 and CID10 (ICD-9 and ICD-10, the 9th and 10th revisions of the International Classification of Diseases), visit http://tabnet.saude.es.gov.br/cgi/tabnet/sim/sim96/obtdescr.htm
Getting specific files¶
[5]:
sim.get_files("CID9", uf="SP", year=1995)
[5]:
[DORSP95.DBC]
[6]:
sim.get_files("CID10", uf=["SP", "RJ"], year=[2019, 2020, 2021])
[6]:
[DORJ2019.dbc,
DORJ2020.dbc,
DORJ2021.dbc,
DOSP2019.dbc,
DOSP2020.dbc,
DOSP2021.dbc]
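The file names returned above appear to follow a fixed pattern: the group prefix from sim.groups, the two-letter state code, then a two-digit year for CID9 or a four-digit year for CID10. The sketch below reproduces that pattern offline. The rule is inferred only from the outputs shown in this notebook, and `expected_sim_filenames` is a hypothetical helper, not part of PySUS.

```python
from itertools import product

# Prefixes as reported by sim.groups in this notebook.
GROUP_PREFIX = {"CID10": "DO", "CID9": "DOR"}


def expected_sim_filenames(group, ufs, years):
    """Build the file names we would expect for a group, states and years.

    Hypothetical helper: the naming rule is inferred from the outputs
    shown in this notebook, not taken from the PySUS source.
    """
    names = []
    for uf, year in product(ufs, years):
        if group == "CID9":
            # CID9 files in the example use a two-digit year and ".DBC".
            names.append(f"{GROUP_PREFIX[group]}{uf}{year % 100:02d}.DBC")
        else:
            # CID10 files in the example use a four-digit year and ".dbc".
            names.append(f"{GROUP_PREFIX[group]}{uf}{year}.dbc")
    # The listing above is ordered by name, so sort the result the same way.
    return sorted(names)


print(expected_sim_filenames("CID10", ["SP", "RJ"], [2019, 2020, 2021]))
```

A helper like this can be handy for checking which combinations of state and year came back from `get_files`, since passing lists for `uf` and `year` yields one file per combination.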
[7]:
files = sim.get_files(["CID9", "CID10"], uf=["SP"], year=[1995, 2020])
sp_cid9, sp_cid10 = files
Describing a file inside DATASUS server¶
[8]:
sim.describe(sp_cid9)
[8]:
{'name': 'DORSP95.DBC',
'uf': 'São Paulo',
'year': 1995,
'group': 'CID9',
'size': '8.2 MB',
'last_update': '2020-01-31 02:48PM'}
[9]:
sim.describe(sp_cid10)
[9]:
{'name': 'DOSP2020.dbc',
'uf': 'São Paulo',
'year': 2020,
'group': 'CID10',
'size': '28.7 MB',
'last_update': '2022-03-31 04:19PM'}
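The `size` and `last_update` values in these dictionaries are plain strings. If you want to sort or compare files by them, they can be converted with the standard library. This is a minimal sketch that assumes the two formats seen above ('28.7 MB' and '2022-03-31 04:19PM') are used consistently; `parse_last_update` and `parse_size_mb` are hypothetical helpers, not PySUS functions.

```python
from datetime import datetime

# Example dictionary copied from the describe() output above.
info = {
    "name": "DOSP2020.dbc",
    "uf": "São Paulo",
    "year": 2020,
    "group": "CID10",
    "size": "28.7 MB",
    "last_update": "2022-03-31 04:19PM",
}


def parse_last_update(text):
    """Parse a timestamp such as '2022-03-31 04:19PM' (12-hour clock)."""
    # %I is the 12-hour clock hour and %p the AM/PM marker.
    return datetime.strptime(text, "%Y-%m-%d %I:%M%p")


def parse_size_mb(text):
    """Return the numeric part of a size string such as '28.7 MB'.

    Assumes the unit is always MB, as in the two examples shown here.
    """
    value, unit = text.split()
    if unit != "MB":
        raise ValueError(f"unexpected unit: {unit}")
    return float(value)


updated = parse_last_update(info["last_update"])
size_mb = parse_size_mb(info["size"])
print(updated.isoformat(), size_mb)
```

Note that `%p` depends on the active locale, so the AM/PM parsing may need adjusting on systems that are not using an English or C locale.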
Downloading files¶
You can either download multiple files at once or download them individually:
[10]:
sim.download(sp_cid9) # Downloads to default directory
DORSP95.parquet: 100%|█████████████| 434k/434k [00:12<00:00, 36.0kB/s]
[10]:
[/home/bida/pysus/DORSP95.parquet]
[11]:
parquet = sp_cid9.download() # Or in a custom directory with `local_dir=`
parquet
[11]:
/home/bida/pysus/DORSP95.parquet
Note: If the file has already been downloaded, you must delete the local copy before you can download the latest version from DATASUS.
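Removing a previously downloaded copy can be scripted with the standard library. The sketch below is one possible approach; `remove_cached` is a hypothetical helper, not part of PySUS, and it handles both a single file and a directory because Parquet output is sometimes written as a directory of part files. The path in the final comment is the one from the download output above and will differ on your machine.

```python
import shutil
from pathlib import Path


def remove_cached(path):
    """Delete a previously downloaded file or directory, if it exists.

    Returns True when something was removed, False otherwise.
    """
    target = Path(path).expanduser()
    if target.is_dir():
        # Parquet datasets are sometimes stored as a directory of part files.
        shutil.rmtree(target)
        return True
    if target.exists():
        target.unlink()
        return True
    return False


# Example call (left commented out because it deletes data):
# remove_cached("/home/bida/pysus/DORSP95.parquet")
```

After the local copy is gone, calling the download method again fetches the file from DATASUS.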
Reading files¶
PySUS writes its output as Parquet files. Use the to_dataframe() method
to read a file as a pandas DataFrame:
[12]:
parquet.to_dataframe()
[12]:
contador | CARTORIO | REGISTRO | DATAREG | TIPOBITO | DATAOBITO | ESTCIVIL | SEXO | DATANASC | IDADE | ... | FONTINFO | ACIDTRAB | LOCACID | CRITICA | NUMEXPORT | CRSOCOR | CRSRES | RACACOR | ETNIA | UFINFORM | |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
0 | 180001 | 951006 | 2 | 951002 | 2 | 1 | 19291003 | 465 | ... | 0 | 0 | 35 | |||||||||
1 | 180002 | 951006 | 2 | 951002 | 3 | 2 | 18980317 | 497 | ... | 0 | 0 | 35 | |||||||||
2 | 180003 | 951006 | 2 | 951003 | 2 | 2 | 19281002 | 467 | ... | 0 | 0 | 35 | |||||||||
3 | 180004 | 951006 | 2 | 951003 | 3 | 1 | 19110613 | 484 | ... | 0 | 0 | 35 | |||||||||
4 | 180005 | 951006 | 2 | 951004 | 1 | 1 | 19610914 | 434 | ... | 0 | 0 | 35 | |||||||||
... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
227832 | 179996 | 951004 | 2 | 951001 | 4 | 1 | 19380423 | 457 | ... | 0 | 0 | 35 | |||||||||
227833 | 179997 | 951004 | 2 | 951001 | 2 | 1 | 19470130 | 448 | ... | 0 | 0 | 35 | |||||||||
227834 | 179998 | 951004 | 2 | 951001 | 3 | 2 | 19160113 | 479 | ... | 0 | 0 | 35 | |||||||||
227835 | 179999 | 951006 | 2 | 951001 | 1 | 1 | 19550901 | 440 | ... | 0 | 0 | 35 | |||||||||
227836 | 180000 | 951006 | 2 | 951001 | 1 | 1 | 19700510 | 425 | ... | 0 | 0 | 35 |
227837 rows × 50 columns
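Several columns in this preview hold coded values rather than plain numbers. For example, the three-digit values such as 465 appear to be IDADE (age) codes. The sketch below decodes them under an assumption that is not confirmed by this notebook: the first digit is a unit code, the last two digits are the quantity, and a leading 4 means years. That reading is consistent with the birth and death dates in the same rows, but check the SIM data dictionary before relying on it. `idade_to_years` is a hypothetical helper, not part of PySUS.

```python
def idade_to_years(code):
    """Decode an IDADE value such as '465' into an age in years.

    Assumption: the first digit is a unit code and the remaining two digits
    are the quantity, with unit 4 meaning years. Other unit codes are not
    interpreted here and yield None.
    """
    text = str(code).strip()
    if len(text) != 3 or not text.isdigit():
        return None
    unit, quantity = text[0], int(text[1:])
    return quantity if unit == "4" else None


# IDADE values from the first rows of the preview above.
print([idade_to_years(c) for c in ["465", "497", "467", "484", "434"]])
```

With the DataFrame loaded, the same function could be applied to the whole column with `df["IDADE"].map(idade_to_years)`, where `df` is the result of `parquet.to_dataframe()`.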